How to Determine File Encoding in Mac OS by Command Line

Sep 2, 2017 - 5 Comments


You can determine a file’s encoding and character set from the command line in Mac OS (and Linux) by using the “file” command, which reports a file’s type along with details such as its MIME type and character set.

This probably won’t be a relevant tip to many users, but if you need to work with a specific character set, or need to know the file type, encoding, or character set of a file from the command line, then this will do the trick.

The file command works in Mac OS and Mac OS X as well as Linux and many other Unix variations, making this trick helpful for scripts and other similar purposes too.

Determining File Encoding & Character Set via Command Line in Mac OS

The basic syntax is as follows:

file -I (input file)

(In case it wasn’t obvious, the flag is a capital “i”, as in -I, not a lowercase L. On most Linux systems the equivalent flag is a lowercase -i.)
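
Because the short flag differs between systems, a script meant to run on both macOS and Linux may be better off with the long option --mime, which both versions of file accept. A minimal sketch, where the function name mime_of is made up for illustration:

```shell
# mime_of is a hypothetical helper name, not part of the file command.
# --mime is the long form of -I on macOS and of -i on most Linux systems.
mime_of() {
    file --mime "$1"
}
```

Calling mime_of text.txt should print the same kind of "type; charset=..." line as file -I does on the Mac.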

Hitting return with a proper file name as the input will reveal a character set like UTF-8, us-ascii, binary, 8bit, etc.

For example, let’s say we’re checking the character set and file encoding of a file named “text.txt”. The syntax would look like this:

$ file -I text.txt
text.txt: text/plain; charset=unknown-8bit

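Here “text/plain” is the file type and “unknown-8bit” is the character set, or file encoding. A result of “unknown-8bit” means the file contains non-ASCII bytes but the command could not match them to a specific encoding.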
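
If a script only needs the character set, the type and the file name can be left out of the output. The sketch below assumes a version of file that supports the -b (brief) and --mime-encoding options, which recent macOS and Linux releases generally do; charset_of is a hypothetical helper name.

```shell
# charset_of is a hypothetical helper, not a standard command.
# -b omits the "filename:" prefix; --mime-encoding prints only the charset.
charset_of() {
    file -b --mime-encoding "$1"
}
```

For a plain ASCII file this should print us-ascii. Keep in mind the result is still a guess based on the bytes in the file.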

You can also issue the file command on literally any other file, be it images, archives, executables, or anything else you want to point the command at. This can be handy if you’re automating something that needs to detect a file type and then run an appropriate command, for instance when a file has been downloaded with curl and the archive type must be determined before it can be extracted.

$ file -I DownloadedFile.zip
DownloadedFile.zip: application/zip; charset=binary
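
The automation idea above could be sketched roughly as follows. This is only an illustration: extract_archive is a hypothetical function name, the list of MIME types is not exhaustive, and it assumes that any gzip file is a compressed tarball and that unzip and tar are installed.

```shell
# extract_archive is a hypothetical helper; adjust the cases to your needs.
extract_archive() {
    archive=$1
    # -b drops the "filename:" prefix; --mime-type prints only the type.
    mime=$(file -b --mime-type "$archive")
    case "$mime" in
        application/zip)
            unzip -q "$archive" ;;
        application/gzip|application/x-gzip)
            # Assumption: gzip files are .tar.gz archives.
            tar -xzf "$archive" ;;
        application/x-tar)
            tar -xf "$archive" ;;
        *)
            echo "unsupported type: $mime" >&2
            return 1 ;;
    esac
}
```

Running extract_archive DownloadedFile.zip would then unpack the zip into the current directory, while a file of an unrecognized type produces an error message and a non-zero exit status.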

There are many other uses for checking character set, file encoding, and file type through the command line with the ‘file’ command, and the -I flag is just one of a wide variety of options available. Check out the manual page for file to learn more if interested, and don’t forget to check out our many other command line tips (or list all terminal commands available on the Mac and have a little fun).

Do you know of another or better way to check file encoding and character set via the command line in Mac OS? Let us know in the comments!

Posted by: Paul Horowitz in Command Line, Mac OS, Tips & Tricks

5 Comments

  1. JohnH says:

    It’s literally impossible to reverse engineer a file’s encoding if the author of the file doesn’t tell you how they encoded it. Tools like this, and others like “chardet” are only guessing the encoding.

  2. Dave Campbell says:

    awesome tip – this really helped me get just the answer i needed.

    importing a file into a SQLite tool – i had to specify the encoding. this tip got me the EXACT answer i needed with no bull*t!

    bravo!

  3. JanD says:

    Yeah, but … “charset=unknown-8bit”. How useful is that?

    @Jens: No, the correct option is -I (uppercase i). Do “$ man file”.

  4. Jens says:

    The flag is a lowercase I, not an uppercase i.

  5. KenJ says:

    I really like tips like this. The more mac oriented and informative, lesser known, less basic the better.

    Sadly everywhere else these days it’s nothing but iPhones etc… On the off chance other sites reference Mac they do so as if they weren’t even computers but an extension of the iPhone. :/
