Programming Language Data Types and Binary Files

In today’s post I will discuss programming language data types. This post highlights another milestone in building a micro-resolution ultrasonic imaging system: I successfully completed a plot script in Python that reads binary data recorded by a data acquisition card and plots the curve. Some would say that this is not a big feat, but it took me over three months to complete. The journey started with understanding what I was reading from the data acquisition card. I am using a Gage Razor Max 4 channel, 16-bit, 1 GS/s, high-speed card (try saying that five times fast). The card’s high speeds are attributed to 1) it plugging directly into the motherboard via a PCIe slot and 2) it being able to record data from a source in binary. Binary data can be laid out in many ways, but according to the manufacturer’s specification sheet my data uses 2 bytes for each sample point. Therefore, when you try to open the data in a traditional text editor it looks foreign.

After learning that each stored data point is 2 bytes, I had to figure out how to read the data successfully and plot it so it could later be used for post processing. MATLAB does a great job of handling binary data. I was able to write a script consisting of 5 lines of code that opens a binary file, reads the data, and outputs the data on a plot. I confirmed that the data generated from the MATLAB script was indeed the data I needed for ultrasonic testing by cross checking it with a digital oscilloscope. A key aspect that is not to be glossed over (because this is a “data types” post) is that you have to tell MATLAB to read the data as “int16”. The “int16” type is the 16-bit signed integer format used to decode the binary data. If you do not specify it, MATLAB will interpret the binary bytes in a different way, and the plot and data handling will be incorrect for my application.

So all was good on the binary data. However, I was using Python to run my stages, and I wanted to use a single programming language both to drive the stages of my micro-resolution ultrasonic system and to process my data. This self-imposed requirement stems from wanting to create a graphical user interface (GUI) that will plot the real-time ultrasonic amplitudes and create a C-scan based on the comparison between the incident ultrasonic wave and the transmitted ultrasonic wave from the system.

This put me on a new quest to understand how to open a binary file, read binary data, and plot the data in real time. We will put a pin in plotting real-time data and save that for a later date. For now, I simply wanted to recreate the amplitude versus time plot that the MATLAB script had produced. I tackled my problem in small chunks. First, I learned how to open a binary file. This step was simple because there are many examples online that show how to complete this task. Next, I learned how to read the data from the binary file. Again this task was easy to complete, but I did not know whether the file had been read correctly. I had to check the program’s input by printing portions of the bytes that the Python script had read. I then compared the numbers to the outputs of the MATLAB script and the digital oscilloscope. However, I kept running into an issue because Python’s data handling is different from MATLAB’s. Therefore, I had to figure out how to tell Python to treat my binary data as the same 16-bit type I had used in MATLAB.
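As a minimal sketch of those first two steps (not my actual script), the snippet below opens a file in binary mode, reads it, and prints a few raw bytes as a sanity check. The temporary file and its six bytes are made-up stand-ins for a real recording; the only fact taken from the card is that each sample occupies 2 bytes.

```python
import os
import tempfile


def read_raw_bytes(path):
    """Return the entire contents of a binary file as a bytes object."""
    with open(path, "rb") as f:  # "rb" = read in binary mode, no text decoding
        return f.read()


# Hypothetical stand-in for a recording: three 2-byte samples.
demo_bytes = bytes([0x01, 0x00, 0xFF, 0xFF, 0x00, 0x80])
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(demo_bytes)
    demo_path = tmp.name

raw = read_raw_bytes(demo_path)
os.remove(demo_path)  # clean up the temporary demo file

num_samples = len(raw) // 2  # 2 bytes per sample
print(len(raw), "bytes ->", num_samples, "samples")
print(raw[:6].hex())  # first few bytes in hexadecimal
```

Printing the bytes in hexadecimal only shows that the file was read. It does not yet say what numbers those bytes represent, which is where the data type comes in.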

After searching day in and day out, I finally consulted the official Python website (www.python.org). In the documentation there was a section that broke down the different data types that can be used in Python, but they did not look anything like MATLAB’s. The breakthrough came when a student that I am advising noticed that the plot did not include negative numbers. Once she pointed this out to me, I remembered that I could change the data type ‘H’ to ‘h’. Voila! The ‘h’ data type did the trick and I am now able to plot the amplitude versus time data from my ultrasonic transducer in Python.
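To illustrate the difference, here is a small sketch using the format characters of Python’s standard `struct` module. It rests on assumptions of mine: the sample values are invented, the helper name `decode_samples` is hypothetical, and the little-endian byte order (`<`) should be checked against your own card’s documentation.

```python
import struct


def decode_samples(raw, signed=True):
    """Decode 2-byte samples from raw bytes (little-endian is an assumption)."""
    count = len(raw) // 2
    code = "h" if signed else "H"  # 'h' = signed short, 'H' = unsigned short
    return list(struct.unpack("<" + str(count) + code, raw[:count * 2]))


# Invented samples that swing above and below zero, like a waveform would.
raw = struct.pack("<4h", 100, -100, 32767, -32768)

as_unsigned = decode_samples(raw, signed=False)
as_signed = decode_samples(raw, signed=True)

print(as_unsigned)  # [100, 65436, 32767, 32768]
print(as_signed)    # [100, -100, 32767, -32768]
```

With ‘H’, every negative sample shows up as a large positive number, so the plot never dips below zero. That is the same symptom my student spotted.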

Now I would be remiss if I did not explain what the ‘H’ and ‘h’ data types are in Python. Based on my research, ‘H’ and ‘h’ correspond to unsigned short and short, respectively. The short data type is a 16-bit signed two’s complement integer that ranges from -32,768 to 32,767. The unsigned short data type is a 16-bit unsigned integer that ranges from 0 to 65,535. Both data types are 2 bytes in size, but as you can see one includes negative numbers and the other does not.
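The two types are just different readings of the same 16 bits. As a rough sketch, the hypothetical helper `unsigned_to_signed16` below applies the two’s complement rule by hand (values of 32,768 and above wrap around by subtracting 65,536) and checks the result against the `struct` module.

```python
import struct


def unsigned_to_signed16(value):
    """Reinterpret an unsigned 16-bit value (0..65535) as two's complement."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    # Bit patterns with the top bit set represent negative numbers.
    return value - 0x10000 if value >= 0x8000 else value


# Both format characters describe a 2-byte value.
print(struct.calcsize("<h"), struct.calcsize("<H"))  # 2 2

for u in (0, 32767, 32768, 65535):
    # Pack as unsigned, then unpack the same bytes as signed.
    via_struct = struct.unpack("<h", struct.pack("<H", u))[0]
    assert via_struct == unsigned_to_signed16(u)
    print(u, "->", via_struct)
```

The loop prints 0 -> 0, 32767 -> 32767, 32768 -> -32768, and 65535 -> -1, which matches the two ranges listed above.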

I hope you found this post valuable. Thank you for your time!

-DB PhD