Synalyze It! Developer Weblog

Here you read what's storming in my brain...

Luac Decompiler based on Synalyze It! grammar

There are plenty of specialized tools that decompile .luac files — see http://lua-users.org/wiki/LuaTools.

Today Synalyze It! learnt to decode this format too, with a grammar for compiled Lua 5.1 files. The advantage is that you can easily see the meaning of every single byte in the hex dump.
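To give an impression of what "the meaning of each byte" looks like at the very start of such a file, here is a minimal Python sketch that decodes the 12-byte global header of a Lua 5.1 chunk. It is not taken from the shared grammar; the function name and the dictionary keys are my own choices for illustration.

```python
import struct

LUAC_SIGNATURE = b"\x1bLua"  # ESC followed by "Lua"


def parse_luac51_header(data: bytes) -> dict:
    """Decode the 12-byte global header of a Lua 5.1 bytecode chunk."""
    if len(data) < 12:
        raise ValueError("too short for a Lua 5.1 header")
    if data[:4] != LUAC_SIGNATURE:
        raise ValueError("missing Lua bytecode signature")
    # Eight single-byte fields follow the signature.
    (version, fmt, little_endian, size_int, size_size_t,
     size_instruction, size_number, integral) = struct.unpack("8B", data[4:12])
    if version != 0x51:
        raise ValueError(f"not a Lua 5.1 chunk (version byte 0x{version:02x})")
    return {
        "version": f"{version >> 4}.{version & 0x0F}",
        "format": fmt,                      # 0 = official format
        "little_endian": little_endian == 1,
        "sizeof_int": size_int,
        "sizeof_size_t": size_size_t,
        "sizeof_instruction": size_instruction,
        "sizeof_number": size_number,
        "integral_numbers": integral == 1,  # 0 = floating point lua_Number
    }
```

The size fields matter for everything after the header: strings are prefixed with a size_t length and numbers use the declared width, so a decoder has to read them before it can walk the function prototypes.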

Thank you to the generous user that shared this grammar and makes it available to all users!

Automatic Backups, Fixed Hex Editor Width…

Synalyze It! is now even more the hex editor that users love when working on binary files.

Version 1.8 has been released with several new features that make hex editing more enjoyable, both for power users and for casual tasks.

You can now choose to have backups created automatically for the files you start changing, so you don't have to be quite so careful anymore.

Many people will like that the number of columns in the hex editor can be fixed now. This is especially useful when editing files that have fixed record widths.

New scripting methods will help process the parsing results automatically with Python or Lua scripts. Additionally the processing speed of scripted data types has been improved dramatically.

With OS X Yosemite due to be released in the foreseeable future, another update will probably follow soon to ensure you can always rely on your favorite OS X hex editor :-)

BSON files decoded

Everyone knows JSON (JavaScript Object Notation), the simple data interchange format. It allows data structures to be exchanged between various languages like Java, Ruby, Python, C# and, of course, JavaScript.

Since JSON is a text format that can be quite verbose, BSON was developed as a binary representation of JSON documents. BSON not only supports the well-known data structures (arrays and name/value pairs) but also has extensions like a Date type.

There is now a Synalyze It! BSON grammar available that lets you decode BSON files easily.
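For readers who want to see how little is needed to walk through a BSON document by hand, here is a minimal Python sketch. It only covers a handful of element types (double, string, embedded document, array, boolean, UTC date, null, int32 and int64) and is not derived from the grammar; all function names are my own.

```python
import datetime
import struct

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def read_cstring(buf: bytes, pos: int):
    """Read a zero-terminated UTF-8 string, return (text, next position)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1


def decode_document(buf: bytes, pos: int = 0):
    """Decode one BSON document, return (dict, position after the document)."""
    (size,) = struct.unpack_from("<i", buf, pos)  # total size, little endian
    end = pos + size - 1                          # index of the trailing 0x00
    if buf[end] != 0:
        raise ValueError("document is not zero-terminated")
    pos += 4
    doc = {}
    while pos < end:
        type_code = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if type_code == 0x01:                     # 64-bit float
            (value,) = struct.unpack_from("<d", buf, pos)
            pos += 8
        elif type_code == 0x02:                   # length-prefixed string
            (length,) = struct.unpack_from("<i", buf, pos)
            value = buf[pos + 4:pos + 4 + length - 1].decode("utf-8")
            pos += 4 + length
        elif type_code in (0x03, 0x04):           # embedded document / array
            value, pos = decode_document(buf, pos)
            if type_code == 0x04:                 # array keys are "0", "1", ...
                value = list(value.values())
        elif type_code == 0x08:                   # boolean
            value = buf[pos] != 0
            pos += 1
        elif type_code == 0x09:                   # UTC date, ms since epoch
            (millis,) = struct.unpack_from("<q", buf, pos)
            value = EPOCH + datetime.timedelta(milliseconds=millis)
            pos += 8
        elif type_code == 0x0A:                   # null
            value = None
        elif type_code == 0x10:                   # int32
            (value,) = struct.unpack_from("<i", buf, pos)
            pos += 4
        elif type_code == 0x12:                   # int64
            (value,) = struct.unpack_from("<q", buf, pos)
            pos += 8
        else:
            raise NotImplementedError(f"element type 0x{type_code:02x}")
        doc[name] = value
    return doc, end + 1
```

Arrays are stored like documents whose keys are the decimal indices, which is why the sketch simply decodes them as a document and keeps the values.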

If you are looking for generic binary formats for storing data structures, you may also be interested in Google protocol buffers or Apache Thrift. Both are supported by many programming languages and encode/decode data structures in a very compact form. In contrast to BSON, these formats are driven more by a schema that describes the structure and its contents.

Decode Named Binary Tag (NBT) Files

Minecraft uses an older file format called Named Binary Tag to save world data. It was created by Markus Alexej "Notch" Persson, the inventor of Minecraft, to save tagged data organized in a tree.

Every tag has a certain type and a name, and most carry some data. The order of the tags does not matter; they are identified by their name.

There are several versions of the NBT format:

  • Minecraft Indev used NBT with tags 0 to 10
  • Version 19132: First known version number - used in Minecraft Beta 1.3
  • Version 19133: Extension of the format by the integer array tag for the Anvil Format

Here you get a free Synalyze It! grammar to parse uncompressed NBT files.
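The tree of typed, named tags can also be walked with a few lines of Python. The following is a minimal sketch for uncompressed NBT data covering tags 0 to 11; it is independent of the grammar, the helper names are mine, and strings are decoded as plain UTF-8, which is a simplification of the encoding Java actually uses.

```python
import io
import struct

# Fixed-size payloads, all big endian: byte, short, int, long, float, double
SIMPLE = {1: ">b", 2: ">h", 3: ">i", 4: ">q", 5: ">f", 6: ">d"}


def _read(stream, fmt):
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of NBT data")
    return struct.unpack(fmt, data)[0]


def _read_string(stream):
    length = _read(stream, ">H")           # unsigned 16-bit length prefix
    return stream.read(length).decode("utf-8")


def read_payload(stream, tag_type):
    if tag_type in SIMPLE:
        return _read(stream, SIMPLE[tag_type])
    if tag_type == 7:                      # byte array
        return stream.read(_read(stream, ">i"))
    if tag_type == 8:                      # string
        return _read_string(stream)
    if tag_type == 9:                      # list of unnamed payloads, one type
        item_type = _read(stream, ">B")
        count = _read(stream, ">i")
        return [read_payload(stream, item_type) for _ in range(count)]
    if tag_type == 10:                     # compound: named tags until TAG_End
        compound = {}
        while True:
            child_type = _read(stream, ">B")
            if child_type == 0:
                return compound
            name = _read_string(stream)
            compound[name] = read_payload(stream, child_type)
    if tag_type == 11:                     # int array
        count = _read(stream, ">i")
        return [_read(stream, ">i") for _ in range(count)]
    raise ValueError(f"unknown tag type {tag_type}")


def read_nbt(data: bytes):
    """Return (root name, root payload) of an uncompressed NBT blob."""
    stream = io.BytesIO(data)
    root_type = _read(stream, ">B")
    return _read_string(stream), read_payload(stream, root_type)
```

Files saved by the game are usually gzip-compressed, so they would have to be decompressed (for example with Python's gzip module) before being handed to read_nbt.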

HyperCard Stacks Decoded

Long before HTML conquered the world, Apple released a hypertext system named HyperCard in 1987. It was used for many purposes, even for games and as presentation software.

It was originally created by Bill Atkinson and given to Apple on the condition that it be made available for free on all Macs.

Version 2 followed in 1990, and an alpha-quality version of HyperCard 3 was shown to the public in 1996 at the Worldwide Developers Conference. Apple finally stopped selling the system in 2004.

Now you can download a free grammar for Synalyze It! that decodes most of the HyperCard stack files. The file format can be easily displayed in a tree view.

Synalyze It! - Free Version Discontinued

Some may not appreciate it, but I have finally decided not to continue the free version of Synalyze It!

This means that 1.4 was the last version offered for $0.

Let me explain why I took this step. About 4 years ago, when I started developing Synalyze It!, I had no plan as to if, how or when to sell it in the future. I spent hundreds of hours making it the application I wished someone else had written.

Several users called me crazy for giving this piece of software away for free. In January 2011 the Mac App Store was launched and I took the chance to sell the more powerful Pro version while the other one remained free.

Later I gave the non-Pro version a price tag of $4.99, which is still a reasonable price. However, it never felt completely right because there was not much difference from the free version on this web site.

Stripping down the free version to a regular hex editor without the grammar features also wouldn't make much sense after offering all features and updates for a long time for free.

Grammars for Shapefiles

Shapefile is a very popular file format for storing geospatial vector data. To be precise, a shapefile consists of three mandatory files, each with its own format. One of them uses the popular dBASE format, which was very common in the MS-DOS era. (Do you remember? ;-)

There are now three new grammars available:

  • .shp - the main shape files with the vector data
  • .shx - contains an index for the shape file data
  • .dbf - the dBASE file with attributes of the shapes

Although the .shp and .shx formats are structured quite simply, they contain an uncommon mixture of little and big endian data, even within the main file header. The record headers are big endian as well, while the record contents are stored in little endian byte order.
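The mixed byte order is easiest to see in the 100-byte main file header. Here is a small Python sketch, written independently of the grammars, that decodes it; the function name and the dictionary keys are my own.

```python
import struct


def parse_shp_header(header: bytes) -> dict:
    """Decode the 100-byte main file header of a .shp or .shx file."""
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # Bytes 0-27: file code, five unused ints, file length -> BIG endian
    file_code, length_words = struct.unpack(">i20xi", header[:28])
    if file_code != 9994:
        raise ValueError(f"unexpected file code {file_code}")
    # Bytes 28-35: version and shape type -> LITTLE endian
    version, shape_type = struct.unpack("<2i", header[28:36])
    # Bytes 36-99: bounding box as eight doubles -> LITTLE endian
    xmin, ymin, xmax, ymax, zmin, zmax, mmin, mmax = struct.unpack(
        "<8d", header[36:100])
    return {
        "file_length_bytes": length_words * 2,  # stored in 16-bit words
        "version": version,
        "shape_type": shape_type,               # e.g. 1 = Point, 5 = Polygon
        "bbox": (xmin, ymin, xmax, ymax),
        "z_range": (zmin, zmax),
        "m_range": (mmin, mmax),
    }
```

Note that the file length is counted in 16-bit words, so it has to be doubled to compare it with the size on disk.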

A side effect of having grammars for shapefiles is the ability to decode dBASE files, at least at the record level.
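To illustrate what "record level" means for a .dbf file, here is a minimal Python sketch that reads the fixed header and the field descriptors that define the record layout. It assumes a plain dBASE III style table without memo fields and is not taken from the grammar; the names are my own.

```python
import struct


def parse_dbf_header(data: bytes) -> dict:
    """Read the dBASE header and its 32-byte field descriptors."""
    # Version byte, last update (YY MM DD), record count, header and
    # record length; all multi-byte values are little endian.
    version, yy, mm, dd, record_count, header_len, record_len = \
        struct.unpack_from("<4BIHH", data, 0)
    fields = []
    pos = 32                                   # descriptors follow the header
    while pos < header_len and data[pos] != 0x0D:   # 0x0D ends the list
        raw_name, ftype, length, decimals = struct.unpack_from(
            "<11sc4xBB", data, pos)
        name = raw_name.split(b"\x00", 1)[0].decode("ascii")
        fields.append((name, ftype.decode("ascii"), length, decimals))
        pos += 32
    return {
        "version": version,
        "last_update": (1900 + yy, mm, dd),    # year is stored as offset
        "record_count": record_count,
        "header_length": header_len,
        "record_length": record_len,
        "fields": fields,
    }
```

Each record then starts at header_length, is record_length bytes long, and begins with a one-byte deletion flag followed by the field values as fixed-width text.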

Here you find free shapefiles of many countries:

If you're interested in more details of the shapefile format, please have a look at the specification.

What is Reverse Engineering?

If you are someone who likes looking behind the scenes, you probably reverse engineered things already as a child. Whenever you try to find out how things work, be it a radio, a computer program or a car, you apply reverse engineering.

Several definitions and explanations are available; Wikipedia has its own article on the topic. Originally the term was used in mechanical engineering, but Elliot J. Chikofsky and James H. Cross II applied it to software engineering.

While the license terms of software products often do not permit analyzing the binaries, there are many cases where reverse engineering is not only legal but also fun and useful.

There are several specialized tools available that support you in understanding binary files, both executables and data. One of these tools on OS X is the Hopper disassembler, which provides deep insight into programs even if the source code is not available.

Apart from executable files there are thousands of different file formats out there which mostly can only be read by the software that produced them. Synalyze It! was developed mainly to support the process of reverse engineering of binary files and give easy access to the contents once the format is described in a "grammar".

Rescue of a WAV file

Not long ago a man contacted me and asked for help.

He had interviewed someone (audio+video), but the audio recording was interrupted at the end. Since the file was not readable by QuickTime and the interview couldn't be repeated, you can imagine the situation he was in.

Now, how to rescue a WAV file that wasn't written until the end?

Unfortunately the user didn't know anything about binary files, their formats and what could be wrong with an audio file. His first idea was to copy the end of a valid WAV file to the end of his interview to make it work.

So he sent me a short audio recording of 3 seconds to extract header and footer as a first step to repair his main interview file.

But… while WAV files do have a well-defined file header, there is no such footer, so the idea of fixing the end of the file couldn't work! We had to take a different approach…
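Since everything a player needs to know sits in the header, that is also the place to look when a recording was cut off. The following Python sketch is not necessarily what was done for this particular file; it merely shows, under the assumption that the data chunk is the last chunk in the file, how the sizes declared in the header can be compared with the real file size and rewritten. The function names are my own.

```python
import struct


def inspect_wav(data: bytes) -> dict:
    """Compare the sizes declared in a RIFF/WAVE header with reality."""
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    (riff_size,) = struct.unpack_from("<I", data, 4)
    info = {"riff_declared": riff_size, "riff_actual": len(data) - 8}
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt " and body + 16 <= len(data):
            _, channels, rate, _, _, bits = struct.unpack_from(
                "<HHIIHH", data, body)
            info.update(channels=channels, sample_rate=rate,
                        bits_per_sample=bits)
        elif chunk_id == b"data":
            # Assumption: the samples run to the end of the file.
            info.update(data_offset=body, data_declared=chunk_size,
                        data_actual=len(data) - body)
            break
        pos = body + chunk_size + (chunk_size & 1)  # chunks are word aligned
    return info


def patch_sizes(data: bytes) -> bytes:
    """Return a copy whose RIFF and data sizes match the actual length."""
    info = inspect_wav(data)
    if "data_offset" not in info:
        raise ValueError("no data chunk found")
    fixed = bytearray(data)
    struct.pack_into("<I", fixed, 4, info["riff_actual"])
    struct.pack_into("<I", fixed, info["data_offset"] - 4,
                     info["data_actual"])
    return bytes(fixed)
```

If the header itself is missing or damaged, patching sizes is not enough; in that case a header from a recording made with the same settings would have to be adapted instead.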

He then downloaded the free version of Synalyze It! and applied the grammar to his file. Because all the structures and their elements were Greek to him, he offered to send me the file.

Doxygen and DocBook

In the previous article I explained how to create a PDF file from DocBook XML. One big advantage of using XML is that you can easily connect it to other sources.

To integrate documentation that Doxygen produces from your source files, tell Doxygen to generate XML output (get it via MacPorts - "port install doxygen"):

/opt/local/bin/doxygen doxygen.config

where GENERATE_XML = YES is set in your doxygen.config.

The next step uses xsltproc which you also get via MacPorts ("port install libxslt"). The many XML files have to be merged into one:

/opt/local/bin/xsltproc --output DoxygenCompound.xml combine.xslt index.xml

In order to use the single Doxygen XML file (DoxygenCompound.xml) you need to translate it to a valid DocBook file:

/opt/local/bin/xsltproc --output DocBookChapter.xml Doxygen2Docbook.xsl DoxygenCompound.xml

I wrote Doxygen2Docbook.xsl myself, and it will probably need to be adapted to your needs. Finally, reference the resulting DocBookChapter.xml in your main DocBook file, where the xi prefix has to be bound to the XInclude namespace (http://www.w3.org/2001/XInclude):

    <xi:include href="DocBookChapter.xml"/>

Synalyze It! Manual and DocBook

Thousands of people use Synalyze It! now, many of them every day to work on their binary files. Although online help is available, a more comprehensive manual was still missing.

There are many ways to create technical documentation. If your primary goal is to produce a PDF file, most word processors are well suited for the job. However, if you want to target different formats and incorporate other input data, DocBook is without any doubt the best choice.

DocBook is an XML language that lets you tag almost everything that can occur in technical documents. Like every good XML language, it doesn't mix structure and presentation. So you describe, for example, a command (<command>ls</command>) as such or a keystroke Cmd-K (<keycombo><keycap>Cmd</keycap><keycap>K</keycap></keycombo>); there's no information on how to display them.

The actual presentation is added later in the process so you can be sure it's applied consistently to the whole document. To produce PDF you usually produce an intermediate XSL-FO file that can be processed by formatters like FOP or any of the commercial ones.

What's going on?

Some months have passed since version 1.2 was published. Many people sent many good ideas and I had to select which features would make it into Synalyze It! 1.3.

Some changes were required by Apple, for example sandboxing. But even the non-Pro version gains some things, like optionally showing hidden files or package contents in the file open dialog, or a Spotlight importer for grammar files.

Users of the Pro version will be delighted by Python scripting, which makes it possible to write custom data types (that even translate changes back to the file!) and to work on grammars or on files displayed in the hex view. If you want to get an impression of what will be possible, have a look at The Script Page.

As always: Any feedback is appreciated!

Andreas

More grammars arrived

ICC (color) profiles can now be decoded using the grammar you find on this page, which also offers a basic grammar for Audible (.aa) files.

A grammar for ELF binary files is on the way but needs some extensions in Synalyze It!, which will be implemented in the next few days.

Thanks to all users sharing their grammars for free!

Please consider sharing your grammars too if they might be useful for others :)

If you noticed the bug when copying number elements with min/max values, please download SynalyzeIt_1.0.3.2.zip

Enjoy your day

Andreas

New grammars arrived

With version 1.0.3 I uploaded a grammar for saved games of Borderlands. Today Pascal Werz sent me a grammar for Mach-O files that could be useful for several developers.

Enjoy & have a nice week :)

Andreas

Next step ahead

The next version of Synalyze It! is almost done... I hope.

Please have a look at SynalyzeIt_1.0.2.6.zip and report any bugs or whatever you think.

The version will come with some new grammars for Mach-O files and others :)

Have a great weekend!

Andreas

PowerPC is still alive ... a bit

Some days ago I installed Xcode 3.2.6 and unfortunately I built version 1.0.2 with it. Today a user noticed that the new version was not a complete universal binary anymore - the PPC part was missing.

The version is now a full universal binary again - sorry for the confusion.

It seems that Apple is trying to get rid of PPC quickly - with Lion even Rosetta will be dropped...

Have a good <whatever you want>

Andreas

1.0.2 is out

Synalyze It! has finally been released in the Mac App Store as well. Unfortunately the versions you find in the MAS and here on this site are not the same.

First, I had to remove some features due to MAS rules (the synalyze shell tool, the automatic online update and the automatic suggestion of grammars available on this site). Additionally, the version you get here contains some fixes that were made while the MAS version waited more than two weeks for approval.

Hopefully I can keep both versions more closely in sync in the future.

Thanks again for all the positive feedback, version 1.0.3 is already in the works :)

Happy synalyzing!

Andreas


Waiting for the Mac App Store

While waiting for the approval of Synalyze It! in the Mac App Store I created a grammar for Windows EMF files.

The version in the Mac App Store will have no automatic online update check, and grammars will have to be downloaded manually due to the Mac App Store rules.

As soon as version 1.0.2 is approved I'll also provide it on this site.

All the best...

Andreas

Ecoutez!

Since je ne parle pas francais very well I don't actually know what's being said in this French podcast of CocoaCast - could someone tell me more?

Localization of software can be really annoying so I wrote a Ruby script now to compare two UTF-16 encoded Strings files and a Rakefile that creates the German XIBs from the English ones using the strings.

The script is not yet really nice because of the redundant code but it does its job and tells me about missing strings in the localized version of the strings file.
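My script is written in Ruby and is not shown here, but the core idea fits in a few lines. The following Python sketch is a rough equivalent, not a port: the regular expression only handles the common "key" = "value"; form, and the function names are made up for this example.

```python
import re

# Matches entries like "key" = "value"; including escaped quotes.
ENTRY = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;')


def parse_strings(text: str) -> dict:
    """Extract key/value pairs from the text of a .strings file."""
    return dict(ENTRY.findall(text))


def load_strings(path: str) -> dict:
    """Read a UTF-16 encoded .strings file from disk."""
    with open(path, encoding="utf-16") as handle:
        return parse_strings(handle.read())


def missing_keys(base: dict, localized: dict) -> list:
    """Keys present in the base language but absent in the localization."""
    return sorted(set(base) - set(localized))
```

Calling missing_keys(load_strings(english_path), load_strings(german_path)) then lists the strings that still need a translation. Comments that happen to contain quoted assignments would confuse this simple pattern.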

Version 1.0.1 was just released and includes many enhancements and fixes. Please report any problems you find :)

Happy synalyzing!

Andreas

Thank you all

Thanks for all the positive feedback! :-D

A lot has been improved and fixed in the past days. The most frequently requested extension, using a number element as the repeat count of structures, has been implemented.

Please have a look before I release the next version and give me feedback.

Happy synalyzing!

Andreas