Which Excel file extension should I use?
Included in this article are the basic structures and key concepts for interacting with this file format programmatically. This article is part of a series of articles that introduce the binary file formats used by Microsoft Office products.
Understanding Office Binary File Formats. Excel Binary File Format. The format is organized into streams and substreams. Each spreadsheet worksheet is stored in its own substream. All of the data is contained in records that have headers, which give the record type and length. Cell records, which contain the actual cell data as well as formulas and cell properties, reside in the cell table. String values are not stored in the cell record, but in a shared strings table, which the cell record references.
Row records contain property information for row and cell locations. Only cells that contain data or individual formatting are stored in the substream. The recommended way to perform most programming tasks in Microsoft Excel is to use the Excel Primary Interop Assemblies. These are a set of .NET classes that provide a complete object model for working with Microsoft Excel. This article series deals only with advanced scenarios, such as where Microsoft Excel is not installed.
Every record begins with a header that holds two values: the record type (rt) and the count of bytes (cb) in the rest of the record. Records may be read or skipped by reading these values, then either reading or skipping the number of bytes specified by cb, depending on the record type specified by rt. A record cannot exceed 8224 bytes. If the data the record applies to is larger than that, the rest is stored in one or more continue records. For more information, see section 2. Specific byte locations within a record are counted from the end of the cb field.
The Workbook stream is the primary stream in an .xls file. The first substream is always the Globals substream, and the rest are sheet substreams. These include worksheets, macro sheets, chart sheets, dialog sheets, and VBA module sheets. The Globals substream specifies global properties and data in a workbook. It also includes a BoundSheet8 record for each substream in the Workbook stream.
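The header-then-payload layout described above can be walked with a short loop. The following is a minimal sketch rather than a reference implementation: it assumes the Workbook stream has already been extracted into a bytes object, and that rt and cb are each 2-byte little-endian unsigned integers (a detail taken from the file format specification, not stated in this article). The name iter_records is hypothetical, and continue records are not merged.

```python
import struct
from typing import Iterator, Tuple


def iter_records(stream: bytes, start: int = 0) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (offset, rt, data) for each record, beginning at byte `start`."""
    pos = start
    while pos + 4 <= len(stream):
        # Assumed header layout: record type (rt) and byte count (cb),
        # both little-endian unsigned 16-bit integers.
        rt, cb = struct.unpack_from("<HH", stream, pos)
        data = stream[pos + 4 : pos + 4 + cb]
        if len(data) < cb:
            break  # truncated record: stop rather than guess
        yield pos, rt, data
        pos += 4 + cb  # skip the header plus cb payload bytes


# Two hand-built records: type 0x0001 with 2 payload bytes, type 0x0002 with none.
sample = struct.pack("<HH2s", 0x0001, 2, b"\xAA\xBB") + struct.pack("<HH", 0x0002, 0)
records = list(iter_records(sample))
```

A caller that only wants certain record types can ignore the payload of the others; the loop still advances by cb, which is the "skip" behavior the text describes.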
A BoundSheet8 record gives information about a sheet substream. This includes name, location, type, and visibility. The first 4 bytes of the record, the lbPlyPos FilePointer, specify the position in the Workbook stream where the sheet substream starts.
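As an illustration, a BoundSheet8 payload could be unpacked roughly as follows. Only the 4-byte lbPlyPos position is confirmed by the text above. The positions of the visibility bits, the type byte, and the length-prefixed name are assumptions based on the published file format specification and should be checked against it. The names BoundSheet and parse_boundsheet8 are hypothetical.

```python
import struct
from typing import NamedTuple


class BoundSheet(NamedTuple):
    lb_ply_pos: int    # stream offset where the sheet substream starts
    hidden_state: int  # assumed: low 2 bits of byte 4
    sheet_type: int    # assumed: byte 5
    name: str


def parse_boundsheet8(data: bytes) -> BoundSheet:
    """Parse the payload of a BoundSheet8 record (header already removed)."""
    (lb_ply_pos,) = struct.unpack_from("<I", data, 0)
    hidden_state = data[4] & 0x03
    sheet_type = data[5]
    # Assumed name layout: 1-byte character count, 1 flag byte whose low bit
    # selects 2-byte characters, then the characters themselves.
    cch = data[6]
    if data[7] & 0x01:
        name = data[8 : 8 + 2 * cch].decode("utf-16-le")
    else:
        name = data[8 : 8 + cch].decode("latin-1")
    return BoundSheet(lb_ply_pos, hidden_state, sheet_type, name)


# Hand-built payload: offset 0x1234, visible, type 0, name "Sheet1" in 1-byte characters.
sample = struct.pack("<IBBBB", 0x1234, 0, 0, 6, 0) + b"Sheet1"
sheet = parse_boundsheet8(sample)
```

Keeping lb_ply_pos alongside the name makes it possible to jump straight to a chosen sheet instead of scanning the whole stream.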
The cell table is the part of a sheet substream where cells are stored. It contains a series of row blocks, each of which has a capacity of 32 rows of cells; the blocks are filled sequentially.
Each row block starts with a series of Row records, followed by the cells that go in the rows, and ends with a DBCell record, which gives the starting offset of the first cell of each row in the block.
A Row record defines a row in a sheet. This is a complex structure, but only the first 6 bytes are needed for basic content retrieval. These give the row index and the columns of the first cells and last cells that contain data or unique formatting in the row.
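Reading those first 6 bytes might look like the sketch below. It assumes three consecutive 2-byte little-endian unsigned integers, in the order row index, first column, last column. The helper name parse_row_header is hypothetical, and the specification may define the last-column field as one past the last used column, so check that before relying on it.

```python
import struct
from typing import Tuple


def parse_row_header(data: bytes) -> Tuple[int, int, int]:
    """Return (row_index, first_col, last_col) from the first 6 bytes of a Row payload."""
    # Assumed layout: three little-endian unsigned 16-bit integers.
    row_index, first_col, last_col = struct.unpack_from("<HHH", data, 0)
    return row_index, first_col, last_col


# Hand-built payload: row 4, columns 0 through 3, followed by 10 bytes that are ignored here.
sample = struct.pack("<HHH", 4, 0, 3) + bytes(10)
row_info = parse_row_header(sample)
```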
All of the cells in a row block are stored after the last row in the block. There are seven kinds of records that represent actual cells in a worksheet. Most cell records begin with a 6-byte Cell structure. The first 2 of those bytes specify the row, the next 2 bytes specify the column, and the last 2 bytes specify an XF record in the Globals substream that contains formatting information.
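The 6-byte Cell structure maps naturally onto a small tuple type. This sketch assumes little-endian 2-byte fields and zero-based row and column indexes, which the text above does not state. Cell, parse_cell, and to_a1 are hypothetical names, and the A1-style conversion is included only to make the indexes easier to read.

```python
import struct
from typing import NamedTuple


class Cell(NamedTuple):
    row: int       # assumed zero-based row index
    col: int       # assumed zero-based column index
    xf_index: int  # index of the XF (formatting) record in the Globals substream


def parse_cell(data: bytes) -> Cell:
    """Read the 6-byte Cell structure at the start of a cell record payload."""
    return Cell(*struct.unpack_from("<HHH", data, 0))


def to_a1(cell: Cell) -> str:
    """Convert zero-based row and column indexes to an A1-style reference."""
    letters = ""
    n = cell.col + 1
    while n:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return f"{letters}{cell.row + 1}"


cell = parse_cell(struct.pack("<HHH", 2, 27, 15))
ref = to_a1(cell)
```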
The following records represent the different kinds of cells. Unless specified otherwise, the first 6 bytes are taken up by the cell structure, and the remaining bytes contain the value. A Blank cell record specifies a blank cell that has no formula or value. This record type is used only for cells that contain individual formatting; otherwise, blank cells are stored in MulBlank records or not at all.
An RK cell record contains a 32-bit number. Excel automatically converts numbers that can be represented in 32 bits or less to this format for storage as a way to reduce file size. Instead of a 6-byte cell structure, the first 2 bytes specify the row and the second 2 bytes specify the column.
The remaining 6 bytes define the number in an RkRec structure for disk and memory optimization. A BoolErr cell record contains a 2-byte Bes structure that may contain either a Boolean value or an error code. A Formula cell record contains both the formula and the resulting data. The value displayed in the cell is defined in a FormulaValue structure in the 8 bytes that follow the cell structure.
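For the RK record described above, the following sketch shows one way to recover the stored number. The article does not spell out the encoding, so the details here are assumptions drawn from the published file format specification: the 6-byte RkRec is treated as a 2-byte XF index followed by a 4-byte value whose bit 0 means "divide by 100", whose bit 1 selects a signed 30-bit integer, and whose remaining 30 bits are otherwise the high bits of an IEEE 754 double. The function names are hypothetical.

```python
import struct
from typing import Tuple


def decode_rk(rk: int) -> float:
    """Decode a 32-bit RK value into a Python float (assumed encoding, see lead-in)."""
    divide_by_100 = rk & 0x1
    is_integer = rk & 0x2
    if is_integer:
        # Reinterpret as signed 32-bit, then drop the two flag bits.
        (signed,) = struct.unpack("<i", struct.pack("<I", rk))
        value = float(signed >> 2)
    else:
        # The 30 data bits are the top of a double; the low 34 bits are zero.
        (value,) = struct.unpack("<d", struct.pack("<Q", (rk & 0xFFFFFFFC) << 32))
    return value / 100 if divide_by_100 else value


def parse_rk_record(data: bytes) -> Tuple[int, int, int, float]:
    """Return (row, col, xf_index, value) from a 10-byte RK record payload."""
    row, col, xf_index, rk = struct.unpack_from("<HHHI", data, 0)
    return row, col, xf_index, decode_rk(rk)


# Row 1, column 2, XF 15, integer 123 stored with the divide-by-100 flag set.
sample = struct.pack("<HHHI", 1, 2, 15, (123 << 2) | 0x3)
parsed = parse_rk_record(sample)
```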
The next 6 bytes can be ignored, and the rest of the record is a CellParsedFormula structure that contains the formula itself. In a MulBlank record, the first 2 bytes give the row, and the next 2 bytes give the column that the series of blanks starts at.
Next, a variable-length array of cell structures follows to store formatting information, and the last 2 bytes show what column the series of blanks ends on. The string values in the shared strings table (SST) are referenced in the worksheet by LabelSst cell records. The first 8 bytes of the SST give the number of references to strings in the workbook and the number of unique string values in the SST. The rest is an array of XLUnicodeRichExtendedString structures that contain the strings themselves as arrays of characters.
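A simplified reader for the SST layout just described might look like this. It is a sketch under several assumptions: the two counts are 4-byte little-endian integers, each string starts with a 2-byte character count followed by a flag byte, and the flag bits for rich-text runs (0x08) and extended data (0x04) follow the published specification. It also assumes the whole table fits in one record, so strings that spill into continue records are not handled. parse_sst is a hypothetical name.

```python
import struct
from typing import List, Tuple


def parse_sst(data: bytes) -> Tuple[int, int, List[str]]:
    """Return (total_refs, unique_count, strings) from an SST record payload."""
    total_refs, unique_count = struct.unpack_from("<II", data, 0)
    pos = 8
    strings: List[str] = []
    for _ in range(unique_count):
        cch, flags = struct.unpack_from("<HB", data, pos)
        pos += 3
        run_count = 0
        ext_size = 0
        if flags & 0x08:  # assumed: rich-text formatting runs follow the text
            (run_count,) = struct.unpack_from("<H", data, pos)
            pos += 2
        if flags & 0x04:  # assumed: extended (phonetic) data follows the text
            (ext_size,) = struct.unpack_from("<i", data, pos)
            pos += 4
        if flags & 0x01:  # low flag bit: characters are 2 bytes each
            text = data[pos : pos + 2 * cch].decode("utf-16-le")
            pos += 2 * cch
        else:             # characters are 1 byte each
            text = data[pos : pos + cch].decode("latin-1")
            pos += cch
        pos += 4 * run_count + ext_size  # skip formatting runs and extended data
        strings.append(text)
    return total_refs, unique_count, strings


# Hand-built table: 3 references, 2 unique strings ("Hi" in 1-byte, "Ok" in 2-byte characters).
sample = (
    struct.pack("<II", 3, 2)
    + struct.pack("<HB", 2, 0) + b"Hi"
    + struct.pack("<HB", 2, 1) + "Ok".encode("utf-16-le")
)
sst = parse_sst(sample)
```

A LabelSst cell would then be resolved by using its stored index to look up an entry in the returned list.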
Bit 16 of this structure specifies whether the characters are 1 byte or 2 bytes each. Although you could load every sheet substream indiscriminately, you gain more control and efficiency by using the BoundSheet8 records to locate just the sheets you want to read. Parsing of formulas and formatting information is beyond the scope of this article. Open the Workbook stream and scan for the first instance of a BOF record. This is the beginning of the Globals substream.
For more details, see Globals. From the BoundSheet8 record that corresponds to the substream you want to open, read the first 4 bytes, which contain the lbPlyPos FilePointer.
Go to the offset in the stream specified by the lbPlyPos FilePointer. This is the BOF record for the worksheet. Read the next record in the substream, which is the Index record, and load the array of pointers that starts at byte 16 of the Index record. Each pointer points to the stream position of a DBCell record. Go to the offset specified by bytes 5-6 of the DBCell record and read into memory all of the cell records, starting at that point and ending with the last byte before the DBCell.
Copy the cell records to the objects that you defined in your internal data structure by record type. By using the tools that are provided in this article, simple data recovery should be within your reach.
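The Index step of the procedure above can be sketched as follows. The text says the pointer array starts at byte 16 of the Index record; this example additionally assumes each pointer is a 4-byte little-endian unsigned integer (the same width as the lbPlyPos FilePointer) and that the array runs to the end of the record. The name dbcell_pointers is hypothetical.

```python
import struct
from typing import List


def dbcell_pointers(index_data: bytes) -> List[int]:
    """Return the DBCell stream positions listed in an Index record payload."""
    # Pointers start at byte 16; each is assumed to be 4 bytes wide.
    count = (len(index_data) - 16) // 4
    if count <= 0:
        return []
    return list(struct.unpack_from(f"<{count}I", index_data, 16))


# Hand-built payload: 16 leading bytes (ignored here) followed by two pointers.
sample = bytes(16) + struct.pack("<2I", 0x400, 0x900)
pointers = dbcell_pointers(sample)
```

Each returned position can then be used to seek to a DBCell record and read the cell records of the row block that precedes it.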
The following procedure shows how to access all of the data from a worksheet.
Create an internal data structure to hold the worksheet content. Define objects to represent each of the eight cell record types in memory.