Hi,
Does anyone know of a utility that can batch-convert Word documents to Unicode or UTF-8 text files? It needs to recurse through subdirectories and let me name the output text files with a file extension of my choice, not just ".txt".
For instance, if I had directories and files as below:
rootdir\
........subdir1\
................file1.doc
................file2.doc
........subdir2\
................file1.doc
................file2.doc
... I could point it at directory rootdir and it would produce (for example) Unicode text files with file extension ".uni" as below:
rootdir\
........subdir1\
................file1.doc
................file1.uni
................file2.doc
................file2.uni
........subdir2\
................file1.doc
................file1.uni
................file2.doc
................file2.uni
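To make the requirement concrete, the traversal and naming part could be sketched in Python roughly as follows. This is only a sketch under assumptions: `convert_tree` and the `convert` callback are hypothetical names made up for illustration, and the actual Word-to-text conversion is deliberately left to whatever tool or library can do that job.

```python
import os
from typing import Callable, List


def convert_tree(root: str,
                 convert: Callable[[str], str],
                 out_ext: str = ".uni") -> List[str]:
    """Walk `root` and, for every .doc file, write a UTF-8 text file
    with the same base name and the extension `out_ext` beside it.

    `convert` is a hypothetical callback: it receives the path of a
    .doc file and returns the document's text as a Python str.
    """
    written = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            base, ext = os.path.splitext(name)
            if ext.lower() != ".doc":
                continue  # leave non-Word files alone
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, base + out_ext)
            text = convert(src)
            # Write the text explicitly as UTF-8, whatever the system default is.
            with open(dst, "w", encoding="utf-8") as fh:
                fh.write(text)
            written.append(dst)
    return written
```

Calling `convert_tree("rootdir", some_converter)` would leave each ".doc" in place and add a ".uni" file next to it, as in the listing above.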
So far I have tried the following with no success:
1/ An old command-line utility called DOC2TXT.EXE, which creates files with the correct names in the correct places. Unfortunately, the option to output Unicode text does not work (the files created are Western European encoded, not Unicode). Has anyone out there got this to work? If so, please tell me how.
2/ "Convert Doc" by Softinterface, Inc. This actually creates good Unicode files in the right places, but insists on naming them with a ".txt" extension. When I have emailed Softinterface to ask about this, they have consistently ignored my emails, which is a pity.
3/ "WordConverterEXE", also from Softinterface, appears to be a GUI front end wrapped around the old DOC2TXT.EXE mentioned above, so it fails in the same way.
4/ Something called EZ-doc2txt (I think), which also did not create Unicode output.
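Since option 2 gets everything right except the extension, one possible stopgap would be a rename pass run after the conversion. Below is a minimal Python sketch of that idea. The function name `rename_converted` is hypothetical, and it assumes that a ".txt" file sitting next to a ".doc" file of the same base name was produced by the converter.

```python
import os
from typing import List


def rename_converted(root: str,
                     old_ext: str = ".txt",
                     new_ext: str = ".uni") -> List[str]:
    """Rename <name>.txt to <name>.uni wherever <name>.doc exists in the
    same directory, recursing through all subdirectories of `root`."""
    renamed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        present = {n.lower() for n in filenames}
        for name in sorted(filenames):
            base, ext = os.path.splitext(name)
            if ext.lower() != old_ext:
                continue
            # Assumption: only text files with a sibling .doc are converter output.
            if (base + ".doc").lower() not in present:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, base + new_ext)
            os.replace(src, dst)  # overwrites an existing destination file
            renamed.append(dst)
    return renamed
```

Text files without a matching ".doc" are left untouched, so unrelated ".txt" files in the tree should survive the pass.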
Any suggestions, anybody?
Thanks in advance.