How to Determine File Encoding in Mac OS by Command Line
You can determine a file’s encoding and character set from the command line in Mac OS (and Linux) by using the “file” command, which reports general and specific information about a file’s type.
This probably won’t be relevant to many users, but if you need to work with a specific character set, or need to know the file type, encoding, or character set of an input file from the command line, then this will do the trick.
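The file command is available in Mac OS and Mac OS X as well as Linux and many other Unix variants, making this trick helpful for scripts and similar purposes too. Note that the capital -I flag used in this article is the Mac OS spelling; on most Linux systems the equivalent flag is a lowercase -i, and the long option --mime should work on both.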
Determining File Encoding & Character Set via Command Line in Mac OS
The basic syntax is as follows:
file -I (input file)
(In case it wasn’t obvious, the flag is a capital “i”, as in -I, not a lowercase “L”.)
Hitting return with a valid file name as the input will reveal the file type along with a character set such as utf-8, us-ascii, binary, or unknown-8bit.
For example, to check the character set and file encoding of a file named “text.txt”, the command and its output would look like this:
$ file -I text.txt
text.txt: text/plain; charset=unknown-8bit
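Here “text/plain” is the file type and “unknown-8bit” is the character set (the file encoding). A result of “unknown-8bit” generally means the file contains non-ASCII bytes that the command could not match to an encoding it recognizes.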
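When a script only needs the character set, it can ask for that field alone instead of parsing the full line. The following is a minimal sketch, assuming the installed version of file supports the -b (brief) flag and the --mime-encoding long option (check the manual page to confirm); charset_of is a hypothetical helper name and the sample file exists only for illustration:

```shell
#!/bin/sh
# charset_of is a hypothetical helper, not part of the file command itself.
# -b (brief) drops the "filename:" prefix from the output, and
# --mime-encoding limits the output to the detected character set.
charset_of() {
    file -b --mime-encoding "$1"
}

# Create a small sample file to inspect (path template is illustrative).
sample=$(mktemp /tmp/charset_sample.XXXXXX)
printf 'hello world\n' > "$sample"

# Print only the character set that file detected for the sample.
charset_of "$sample"

rm -f "$sample"
```

Because the result is a single word, it is easy to compare in a test or pass to another tool, though it remains a guess based on the file’s contents.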
You can also issue the file command on literally any other file, be it images, archives, executables, or anything else you want to point the command at. This can be nice if you’re automating something to detect a file type to then run an appropriate command, perhaps after a file has been downloaded with curl and the archive type needs to be determined before a proper command can be executed.
$ file -I DownloadedFile.zip
DownloadedFile.zip: application/zip; charset=binary
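The automation idea above could be sketched roughly as follows. This is only an illustration, assuming the installed version of file supports -b and --mime-type; the helper names extractor_for_mime and extractor_for are hypothetical, and the list of MIME types is deliberately short rather than complete:

```shell
#!/bin/sh
# Hypothetical helper: map a MIME type string to an extraction command.
# The set of types handled here is an assumption for illustration only.
extractor_for_mime() {
    case "$1" in
        application/zip)                     echo "unzip" ;;
        application/gzip|application/x-gzip) echo "gunzip" ;;
        application/x-tar)                   echo "tar -xf" ;;
        *)                                   echo "unknown" ;;
    esac
}

# Hypothetical helper: ask file for the MIME type of a path, then look up
# the matching extraction command. -b omits the "filename:" prefix and
# --mime-type omits the charset portion of the output.
extractor_for() {
    extractor_for_mime "$(file -b --mime-type "$1")"
}

# Example usage after a download (not executed here):
#   curl -L -o DownloadedFile.zip "$url"
#   extractor_for DownloadedFile.zip
```

Separating the lookup from the call to file keeps the mapping easy to extend, and the fallback of “unknown” lets a script stop rather than run the wrong tool.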
There are many other uses for checking character set, file encoding, and file type through the command line with the ‘file’ command, and the -I flag is just one of a wide variety of options available. Check out the manual page for file to learn more if interested, and don’t forget to check out our many other command line tips (or list all terminal commands available on the Mac and have a little fun).
Do you know of another or better way to check file encoding and character set via the command line in Mac OS? Let us know in the comments!
It’s literally impossible to reverse engineer a file’s encoding if the author of the file doesn’t tell you how they encoded it. Tools like this, and others like “chardet” are only guessing the encoding.
awesome tip – this really helped me get just the answer i needed.
importing a file into a sqllite tool – i had to specify the encoding. this tip got me the EXACT answer i needed with no bull*t!
bravo!
Yeah, but … “charset=unknown-8bit”. How useful is that?
@Jens: No, the correct option is -I (uppercase i). Do “$ man file”.
The flag is a lowercase I, not an uppercase i.
I really like tips like this. The more mac oriented and informative, lesser known, less basic the better.
Sadly everywhere else these days it’s nothing but iPhones etc… On the off chance other sites reference Mac they do so as if they weren’t even computers but an extension of the iPhone. :/