master
Ondřej Hruška 6 years ago
parent 6e2c9aa5d2
commit 2152413dfe
Signed by: MightyPork
GPG Key ID: 2C5FD5035250423D
  1. 74
      ch.fat16.tex
  2. 2
      ch.usb.tex
  3. 62
      thesis.bib
  4. BIN
      thesis.pdf

@ -1,6 +1,6 @@
\chapter{The FAT16 File System and Its Emulation} \label{sec:fat16}
A \gls{FS} is used by GEX to provide a comfortable access to the configuration files. By emulating a Mass Storage \gls{USB} device, the module appears as a thumb drive on the host \gls{PC}, and the user can edit its configuration using their preferred text editor. The FAT16 file system was selected for its simplicity and a good cross-platform support.
A \gls{FS} is used by GEX to provide comfortable access to the configuration files. By emulating a Mass Storage \gls{USB} device, the module appears as a thumb drive on the host \gls{PC}, and the user can edit its configuration using their preferred text editor. The FAT16 file system was selected for its simplicity and good cross-platform support \cite{os-support-table}.
Three variants of the \gls{FAT} file system exist: FAT12, FAT16, and FAT32. FAT12 was used on floppy disks and it is similar to FAT16, except for additional size constraints and a \gls{FAT} entry packing scheme. FAT16 and FAT32 are FAT12's later developments from the time when hard disks became more common and the old addressing scheme couldn't support their larger capacity.
@ -8,6 +8,8 @@ This chapter will explain the structure of FAT16 and the challenges faced when t
\section{The General Structure of the FAT File System}
An overview will be presented here without going into too many details that would overwhelm the reader and can be looked up elsewhere. Several resources \cite{ms-fat,fat16-brainy,fat16-maverick,fat16-phobos,fat-whitepaper} were consulted during the development of the GEX firmware which provide a more complete description of the FAT16 file system, with \cite{fat-whitepaper}, the Microsoft white paper, giving the most detailed overview.
The storage medium is organized into \textit{sectors} (or \textit{blocks}), usually 512 bytes long. A sector is the smallest addressable unit in the disk structure. The disk starts with a \textit{boot sector}, also called a \gls{MBR}. That is followed by optional reserved sectors, one or two copies of the file allocation table, and the root directory. All disk areas are aligned to a sector boundary:
\begin{table}[h]
@ -29,7 +31,7 @@ The storage medium is organized into \textit{sectors} (or \textit{blocks}), usua
\subsection{Boot Sector}
This is a 1-sector structure which holds the \gls{OS} bootstrap code for bootable disks. The first 3 bytes are a jump instruction to the actual bootstrap code located later in the sector. What matters to us when implementing the file system is that the boot sector also contains data fields describing how the disk is organized, what file system is used, who formatted it, etc. The size of the \gls{FAT} and the root directory is defined here. The exact structure of the boot sector can be found in XXX\todo{add ref link} or in the attached GEX source code.
This is a 1-sector structure which holds the \gls{OS} bootstrap code for bootable disks. The first 3 bytes are a jump instruction to the actual bootstrap code located later in the sector. What matters to us when implementing the file system is that the boot sector also contains data fields describing how the disk is organized, what file system is used, who formatted it, etc. The sizes of the \gls{FAT} and the root directory are defined here. The exact structure of the boot sector can be found in any of \cite{ms-fat,fat16-brainy,fat16-maverick,fat16-phobos,fat-whitepaper} or in the attached GEX source code.
\subsection{File Allocation Table}
@ -37,19 +39,19 @@ The data area of the disk is organized in clusters, logical allocation units com
The \gls{FAT} acts as a look-up table combined with linked lists. In FAT16, it is organized in 16-bit fields, each corresponding to one cluster. The first two entries in the allocation table are reserved and hold special values set by the disk formatter and the host \gls{OS}: a ``media descriptor'' 0xFFF8 and a ``clean/dirty flag'' 0xFFFF/0x3FFF.
Files can span multiple clusters; each \gls{FAT} entry either holds the address of the following file cluster, or a value with a special meaning:
Files can span multiple clusters; each \gls{FAT} entry either holds the address of the following file cluster, or a special value:
\begin{itemize}
\item 0x0000 - free cluster
\item 0xFFFF - last cluster of the file (still including file data!)
\item 0xFFF7 - bad cluster
\begin{itemize}[nosep]
\item 0x0000 \dots free cluster
\item 0xFFFF \dots last cluster of the file (still including file data)
\item 0xFFF7 \dots bad cluster
\end{itemize}
The bad cluster mark, 0xFFF7, is used for clusters which are known to corrupt data due to a flaw in the storage medium, such as a bad memory cell.
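The linked-list behavior of the \gls{FAT} can be illustrated by a short C sketch that walks a cluster chain and counts its length. This is an illustration only, not code taken from the GEX firmware; the names \verb|fat16_chain_length|, \verb|FAT16_FREE|, \verb|FAT16_BAD| and \verb|FAT16_EOC_MIN| are hypothetical, and the table is assumed to be available as an in-memory array. Values 0xFFF8 to 0xFFFF are all treated as an end-of-chain mark, of which 0xFFFF is the one commonly written.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT16_FREE    0x0000u /* free cluster */
#define FAT16_BAD     0xFFF7u /* bad cluster */
#define FAT16_EOC_MIN 0xFFF8u /* 0xFFF8..0xFFFF end the chain */

/*
 * Count the clusters of the chain that starts at cluster `first`.
 * `fat` is the table as an array of 16-bit entries (assumed to be in RAM).
 * Returns 0 if the chain is broken (free/bad entry, bad index, or a loop).
 */
size_t fat16_chain_length(const uint16_t *fat, size_t fat_entries, uint16_t first)
{
    size_t count = 0;
    uint16_t cur = first;

    /* A valid chain can never be longer than the table itself. */
    while (count < fat_entries) {
        if (cur < 2 || cur >= fat_entries) {
            return 0; /* entries 0 and 1 are reserved; others are out of range */
        }
        count++;

        uint16_t next = fat[cur];
        if (next >= FAT16_EOC_MIN) {
            return count; /* last cluster, it still holds file data */
        }
        if (next == FAT16_FREE || next == FAT16_BAD) {
            return 0; /* a chain must not run into a free or bad cluster */
        }
        cur = next;
    }
    return 0; /* more links than entries: the chain loops */
}
```

The loop bound guards against a corrupted table in which a chain points back at itself.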
\subsection{Root Directory}
The root directory has the same structure as any other directories, which reside in clusters the same way like ordinary files. The difference is that the root directory is allocated when the disk is formatted and it has a fixed and known position and size. Sub-directories are stored on the disk in a similar way to regular files, therefore they can span multiple sectors and their file count can be much larger than that of the root directory.
The root directory has the same structure as any other directory; the others reside in clusters in the same way as ordinary files. The difference is that the root directory is allocated when the disk is formatted and it has a fixed and known position and size. Sub-directories are stored on the disk in a way similar to regular files, therefore they can span multiple sectors and their file count can be much larger than that of the root directory.
\begin{table}
\centering
@ -70,16 +72,15 @@ The root directory has the same structure as any other directories, which reside
\caption{\label{tab:fat16-dir-entry}Structure of a FAT16 directory entry}
\end{table}
A directory is organized in 32-byte entries representing individual files. Table \ref{tab:fat16-dir-entry} shows the structure of one such entry.
The name and extension fields form together the well-known 8.3 filename format. Longer file names are encoded using a \gls{LFN} scheme as special hidden entries stored in the directory table alongside the regular 8.3 entries, ensuring backward compatibility.
A directory is organized in 32-byte entries representing individual files. Table \ref{tab:fat16-dir-entry} shows the structure of one such entry. The name and extension fields together form the well-known 8.3 filename format (referring to the byte size of the first two fields). Longer file names are encoded using a \gls{LFN} scheme \cite{fat-lfn} as special hidden entries stored in the directory table alongside the regular 8.3 entries, ensuring backward compatibility.
\noindent
The first byte of the file name has a special meaning:
\begin{itemize}
\item 0x00 - indicates that there are no more files when searching the directory
\item 0xE5 - marks a free slot; this is used when a file is deleted
\item 0x05 - indicates that the first byte should actually be 0xE5, which was used in a Japanese character set.
\item 0x05 - indicates that the first byte should actually be 0xE5, a code used in some character sets at the time, and the slot is \textit{not} free\footnote{The special meaning of 0xE5 appears to be a correction of a less than ideal design choice earlier in the development of the file system}.
\item Any other value, except 0x20 (space) and characters forbidden in a DOS file name, starts a valid file entry. Generally, only space, A-Z, 0-9, \verb|-| and \verb|_| should be used in file names for maximum compatibility.
\end{itemize}
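The rules for the first byte listed above could be applied as in the following C sketch. The type \verb|dirent_state_t| and the function \verb|fat16_dirent_state| are hypothetical names introduced for illustration; the sketch only inspects the first byte and does not check the remaining characters against the set forbidden in DOS file names.

```c
#include <stdint.h>

typedef enum {
    DIRENT_END,     /* 0x00: no more files in this directory */
    DIRENT_FREE,    /* 0xE5: free slot, e.g. a deleted file */
    DIRENT_INVALID, /* 0x20: a name must not start with a space */
    DIRENT_USED     /* anything else: start of a file entry */
} dirent_state_t;

/*
 * Classify a 32-byte directory entry by the first byte of its name.
 * For a used entry, the real first character is stored in *first_char,
 * with the 0x05 escape translated back to 0xE5.
 */
dirent_state_t fat16_dirent_state(const uint8_t entry[32], uint8_t *first_char)
{
    switch (entry[0]) {
    case 0x00:
        return DIRENT_END;
    case 0xE5:
        return DIRENT_FREE;
    case 0x20:
        return DIRENT_INVALID;
    case 0x05:
        *first_char = 0xE5; /* escaped 0xE5, the slot is NOT free */
        return DIRENT_USED;
    default:
        *first_char = entry[0];
        return DIRENT_USED;
    }
}
```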
@ -96,14 +97,18 @@ Figure \ref{fig:fat-example} shows a possible organization of the GEX file syste
\section{FAT16 Emulation}
The FAT16 file system is relatively straightforward and easy to implement. This is the reason why an emulation driver for it was developed as part of the open-source ARM mbed DAPLink project. \todo{reference} It is used there for a drag-and-drop flashing of firmware images to the target microcontroller, taking advantage of it working well across different host platforms. ARM mbed uses a browser-based \gls{IDE} and cloud build servers, thus the end user does not need to install or set up any software or drivers to program a compatible development kit. The GEX firmware adapts several parts of this code, optimizes its \gls{RAM} usage and further expands its functionality to support our specific use case.
The FAT16 file system is relatively straightforward to implement. However, it is not practical or even possible to keep the entire file system in memory on a small microcontroller like our STM32F072. This means that we have to generate and parse disk sectors and clusters on-demand, when the host reads or writes them. The STM32 \gls{USB} Device library helpfully implements the \gls{MSC} and provides \gls{API} endpoints to which we connect our file system emulator. Specifically, those are requests to read and write a sector, and to read disk status and parameters, such as its size.
\subsection{DAPLink Emulator}
It is not practical or even possible to keep the entire file system in memory, especially with a microcontroller like the STM32F072, which has only 16\,kB of \gls{RAM} in total. This means that we have to generate and parse disk sectors and clusters on-demand, when the host reads or writes them. The STM32 \gls{USB} Device library helpfully implements the \gls{MSC} and provides \gls{API} endpoints to which we connect our file system emulator. Specifically, those are requests to read and write a sector, and to read disk status and parameters, such as its size.
A FAT16 emulator was developed as part of the open-source Arm Mbed DAPLink project \cite{daplink}. It is used there for drag-and-drop flashing of firmware images to the target microcontroller, taking advantage of the inherent cross-platform support (it uses the same software driver as any thumb drive, as discussed in Section \ref{sec:msc}). Arm Mbed also uses a browser-based \gls{IDE} and cloud build servers, thus the end user does not need to install or set up any software to program a compatible development kit.
As shown in table \ref{tab:fat16-disk-areas}, the disk consists of several areas. The boot sector is immutable and can be stored in Flash. The handling of the other areas (\gls{FAT}, data area) depends on whether we're dealing with a read or write access:
The GEX firmware adapts several parts of the DAPLink code, optimizing its \gls{RAM} usage and porting it to work with FreeRTOS. Those modified files are located in the folder \mono{User/vfs} of the GEX source code repository; the original Apache 2.0 open source software license headers, as well as file names, have been retained.
\subsection{Handling a Read Access}
As shown in table \ref{tab:fat16-disk-areas}, the disk consists of several areas. The boot sector is immutable and can be stored in and read from the Flash memory. The handling of the other disk areas (\gls{FAT}, data area) depends on the type of access: read or write.
The user can only read files that already exist on the disk, in our case, \verb|UNITS.INI| and \verb|SYSTEM.INI|. Those files are generated from the binary settings storage, and conversely, parsed, line-by-line, without ever existing in their full form. This fact makes our task more challenging, as the files can't be easily measured and there's no obvious way to read a sector from the middle of a longer file. We solve this by implementing two additional functions in the INI file writer: a \textit{read window} and a \textit{dummy read mode}.
A read window is a byte range which we wish to generate. The INI writer discards bytes before the start of the read window, writes those inside the window to our data buffer, and stops when its end is reached. This lets us extract a sector from anywhere in a file. The second function, dummy read, is tied to the window function: we set the start index so high that it's never reached (e.g. 0xFFFFFFFF), and have the writer count discarded characters. When the dummy file generation ends, this character counter holds its size.
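The two functions can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the actual GEX INI writer: the type \verb|win_writer_t| and the functions \verb|win_init| and \verb|win_write| are hypothetical. Unlike the writer described above, this sketch keeps counting after the end of the window is reached instead of stopping early, and it counts every byte offered to it, which in dummy mode equals the number of discarded bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t skip;  /* bytes still to discard before the window starts */
    uint32_t room;  /* bytes still free inside the window */
    uint8_t *out;   /* next write position; may be NULL in dummy mode */
    uint32_t total; /* number of bytes offered to the writer so far */
} win_writer_t;

/* Prepare a writer that captures `len` bytes starting at offset `start`. */
void win_init(win_writer_t *w, uint32_t start, uint8_t *buf, uint32_t len)
{
    w->skip = start;
    w->room = len;
    w->out = buf;
    w->total = 0;
}

/* Feed one piece of the generated file through the window. */
void win_write(win_writer_t *w, const char *s)
{
    size_t n = strlen(s);
    w->total += (uint32_t)n;

    if (w->skip >= n) { /* entirely before the window: discard */
        w->skip -= (uint32_t)n;
        return;
    }
    s += w->skip;       /* drop the part that precedes the window */
    n -= w->skip;
    w->skip = 0;

    if (n > w->room) {  /* clip to the end of the window */
        n = w->room;
    }
    if (n == 0) {
        return;         /* window already full */
    }
    memcpy(w->out, s, n);
    w->out += n;
    w->room -= (uint32_t)n;
}
```

To measure a file, the writer would be initialized with \verb|win_init(&w, 0xFFFFFFFF, NULL, 0)| and the whole file generated once; \verb|w.total| then holds its size. To read one sector, the window is set to the byte offset of that sector within the file and a length of 512.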
@ -112,9 +117,9 @@ Now, just one problem remains: how to tell which sectors contain which part of o
\subsection{Handling a Write Access}
A file write access is more challenging to emulate than a read access, as the host OS tends to be somewhat unpredictable. In GEX's case we're interested only in the action of overwriting an already existing file, but it's interesting to also analyze other actions the host may perform.
A write access to the disk is more challenging to emulate than a read access, as the host OS tends to be somewhat unpredictable. In GEX's case we are interested only in the action of overwriting an already existing file, but it is interesting to also analyze other actions the host may perform.
It must be noted that due to the nonexistence of a physical storage medium, it's not possible to read back a file the host has written. The \gls{OS} may show the written file on the disk, but when the user tried to read it, it either fails, or shows a cached copy. In the DAPLink emulator this is worked around by temporarily reporting that the storage medium has been removed, forcing the host to re-load its contents. In GEX, the loaded INI file will be a newly generated copy, embedding possible error messages as comments.
It must be noted that due to the nonexistence of a physical storage medium, it is not possible to read back a file the host has previously written, unless we store or re-generate its content when such a read access occurs. The \gls{OS} may show the written file on the disk, but when the user tries to read it, the action either fails, or shows a cached copy. The emulator works around this problem by temporarily reporting that the storage medium has been removed, forcing the host to re-load its contents.
\subsubsection{File Deletion}
@ -125,11 +130,11 @@ A file is deleted by:
\item Replacing the first character of its name in the directory table by 0xE5 to indicate the slot is free
\end{enumerate}
From the perspective of emulation, we can ignore the \gls{FAT} access and only detect writes to the directory sectors. This is slightly more complicated when one considers that all disk access is performed in sectors: the emulator must compare the written data with the original bytes to detect what change has been performed. Alternatively, we could parse the written sector as a directory table and compare it with our knowledge of its original contents.
From the perspective of emulation, we can ignore the \gls{FAT} access and only detect writes to the directory sectors. This is slightly more complicated when one considers that all disk access is performed in sectors: the emulator must compare the written data with the original bytes to detect what change has been performed. Alternatively, we could parse the entire written sector as a directory table and compare it with our knowledge of its original contents.
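The byte comparison described above might look like the following C sketch. The function \verb|find_deleted_entry| and the two constants are hypothetical names, and the original sector content is assumed to be available for comparison, for example by re-generating it from the emulator's knowledge of the directory; the sketch does not reflect how the GEX firmware actually stores this information.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u
#define DIRENT_SIZE 32u

/*
 * Compare a directory sector written by the host (`new_sec`) with its
 * original content (`old_sec`). Returns the index of the first entry
 * that was in use before and is now marked free (0xE5), or -1 if no
 * entry has been deleted in this sector.
 */
int find_deleted_entry(const uint8_t *old_sec, const uint8_t *new_sec)
{
    for (size_t i = 0; i < SECTOR_SIZE / DIRENT_SIZE; i++) {
        uint8_t before = old_sec[i * DIRENT_SIZE];
        uint8_t after = new_sec[i * DIRENT_SIZE];

        /* 0x00 and 0xE5 mean the slot held no file to begin with. */
        if (before != 0x00 && before != 0xE5 && after == 0xE5) {
            return (int)i;
        }
    }
    return -1;
}
```

Only the first byte of each 32-byte entry is inspected, which is sufficient to detect a deletion but not a rename.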
\subsection{File Name Change}
A file is renamed by modifying its directory entry. This can be be detected in a similar way to a file deletion. In the simple case of a short, 8.3 file name, this is a in-place modification of the file entry. Long file names, using the \gls{LFN} extension, are a complication, as the number of dummy entries might change when the file name is shortened or made longer, and subsequently the following entries in the table may shift or be entirely re-arranged.
A file is renamed by modifying its directory entry. In the simple case of a short, 8.3 file name, this is an in-place modification of the file entry. Long file names, using the \gls{LFN} extension, are a complication, as the number of non-file entries holding the long file name might change, and subsequently the following entries in the table may shift or be re-arranged.
\subsection{File Creation}
@ -149,38 +154,11 @@ The uncertain order of the written disk areas poses a problem when the file name
\subsection{File Content Change}
A change to a file's content is performed in a similar way to the creation of a new file, except instead of creating a new entry in the directory table, an existing one is updated with the new file size. The name of the file may be, again, unknown until the content is written, but we could detect the file by comparing the starting sector with those of all files known to the virtual file system.
A change to a file's content is performed in a similar way to the creation of a new file, except instead of creating a new entry in the directory table, an existing one is updated with the new file size. The name of the file may be unknown until the content is written, but we could detect the file name by comparing the start sector with those of all files known to the virtual file system.
In the case of GEX, the detection of a file name is not important; we expect only INI files to be written, and the particular file may be detected by its first section marker, such as \verb|[UNITS]| or \verb|[SYSTEM]|. Should a non-INI file be written by accident, the INI parser will likely detect a syntax error and discard it.
It should be noted that a file could be updated only partially, skipping the clusters which remain unchanged, and there's also no guarantee regarding the order in which the file's sectors are written. This is hard to detect and handle and the current firmware is not able to interpret it correctly, thus such a write operation will fail. Fortunately, this host behavior has not been conclusively observed in practice, but a write operation rarely fails for still unknown reasons and this could be a possible cause.
It should be noted that a file could be updated only partially, skipping the clusters which remain unchanged, and there is also no guarantee regarding the order in which the file's sectors are written. This is hard to handle correctly, but the emulator can detect it, and such a write operation will be discarded. This host behavior has not been conclusively observed in practice; however, writing a file does occasionally fail for unknown reasons, and this could be a possible cause.

@ -80,7 +80,7 @@ The V$_\mathrm{BUS}$ line supplies power to \textit{bus-powered} devices. \texti
This section explains the classes used in the GEX firmware. A list of all standard classes with a more detailed explanation can be found in \cite{usb-class-list}.
\subsection{Mass Storage Class}
\subsection{Mass Storage Class} \label{sec:msc}
The \gls{MSC} is supported by all modern operating systems (MS Windows, MacOS, GNU/Linux, FreeBSD etc.) to support thumb drives, external disks, memory card readers and other storage devices.

@ -113,6 +113,13 @@
urldate = {2018-05-12}
}
@online{os-support-table,
title = {Comparison of File Systems / OS Support},
author = {{Wikipedia contributors}},
url = {https://en.wikipedia.org/wiki/Comparison_of_file_systems#OS_support},
urldate = {2018-05-12}
}
@online{daplink,
title = {Arm Mbed DAPLink source code repository},
author = {{Arm Mbed}},
@ -145,6 +152,25 @@
urldate = {2018-05-12}
}
@article{fat-lfn,
author = {{``vinDaci''}},
title = {Long Filename Specification},
year = {1998},
url = {http://home.teleport.com/~brainy/lfn.htm},
urldate = {2018-05-12}
}
@techreport{fat-whitepaper,
author = MSFT,
title = {FAT: General Overview of On-Disk Format},
url = {https://staff.washington.edu/dittrich/misc/fatgen103.pdf},
year = {2000},
urldate = {2018-05-12}
}
% FreeRTOS
@online{freertos-ports-list,
title = {FreeRTOS Ports},
author = {{Real Time Engineers Ltd.}},
@ -167,7 +193,7 @@
}
@book{freertos-book,
title = {Mastering the FreeRTOS™ Real Time Kernel},
title = {Mastering the FreeRTOS™ Real Time Kernel},
subtitle = {A Hands-On Tutorial Guide},
author = {Richard Barry},
publisher= {Real Time Engineers Ltd.},
@ -177,7 +203,7 @@
}
@manual{freertos-rm,
title = {The FreeRTOS™ Reference Manual},
title = {The FreeRTOS™ Reference Manual},
author = {{Real Time Engineers Ltd.}},
publisher= {Real Time Engineers Ltd.},
url = {https://www.freertos.org/Documentation/FreeRTOS_Reference_Manual_V10.0.0.pdf},
@ -185,22 +211,6 @@
urldate = {2018-05-12}
}
@article{fat-lfn,
author = {{vinDaci}},
title = {Long Filename Specification},
year = {1998},
url = {http://home.teleport.com/~brainy/lfn.htm},
urldate = {2018-05-12}
}
@techreport{fat-whitepaper,
author = MSFT,
title = {FAT: General Overview of On-Disk Format},
url = {https://staff.washington.edu/dittrich/misc/fatgen103.pdf},
year = {2000},
urldate = {2018-05-12}
}
% STM32
@ -271,7 +281,7 @@
series={Ask The Application Engineer},
volume={33},
url={http://www.analog.com/media/en/analog-dialogue/volume-38/number-3/articles/all-about-direct-digital-synthesis.pdf},
year={2004}
year={2004}
}
@techreport{understanding-i2c,
@ -284,7 +294,7 @@
}
@manual{i2c-spec,
title = {I2C-bus specification and user manual},
title = {I2C-bus specification and user manual},
author = {{NXP Semiconductors}},
url = {https://www.nxp.com/docs/en/user-guide/UM10204.pdf},
year = {2014},
@ -292,7 +302,7 @@
}
@manual{nrf-manual,
title = {nRF24L01+ Single Chip 2.4GHz Transceiver Product Specification v1.0},
title = {nRF24L01+ Single Chip 2.4GHz Transceiver Product Specification v1.0},
author = {{Nordic Semiconductor}},
url = {http://www.nordicsemi.com/eng/content/download/2726/34069/file/nRF24L01P_Product_Specification_1_0.pdf},
year = {2008},
@ -300,7 +310,7 @@
}
@manual{semtech-manual,
title = {SX1276/77/78/79 datasheet},
title = {SX1276/77/78/79 datasheet},
author = {{Semtech Corporation}},
url = {https://www.semtech.com/uploads/documents/DS_SX1276-7-8-9_W_APP_V5.pdf},
year = {2016},
@ -309,7 +319,7 @@
@manual{f072-rm,
title = {RM0091: STM32F0x1/STM32F0x2/STM32F0x8 reference manual},
title = {RM0091: STM32F0x1/STM32F0x2/STM32F0x8 reference manual},
author = STM,
url = {http://www.st.com/resource/en/reference_manual/dm00031936.pdf},
year = {2017},
@ -317,7 +327,7 @@
}
@manual{f072-ds,
title = {STM32F072x8/STM32F072xB datasheet},
title = {STM32F072x8/STM32F072xB datasheet},
author = STM,
url = {http://www.st.com/resource/en/datasheet/stm32f072c8.pdf},
year = {2017},
@ -326,7 +336,7 @@
@manual{f103-rm,
title = {RM0008: STM32F101xx, STM32F102xx, STM32F103xx, STM32F105xx reference manual},
title = {RM0008: STM32F101xx, STM32F102xx, STM32F103xx, STM32F105xx reference manual},
author = STM,
url = {http://www.st.com/resource/en/reference_manual/cd00171190.pdf},
year = {2009},
@ -334,7 +344,7 @@
}
@manual{f103-ds,
title = {STM32F103x8/STM32F103xB datasheet},
title = {STM32F103x8/STM32F103xB datasheet},
author = STM,
url = {http://www.st.com/resource/en/datasheet/CD00161566.pdf},
year = {2015},

Binary file not shown.