Common Identifiers
There are two key identifiers that are used across data sets.
ISBNs
We use ISBNs extensively to link records across data sets. To speed up ISBN-based operations, we map textual ISBNs to numeric ‘ISBN IDs’.
book-links/all-isbns.parquet
This file stores ISBN IDs and their mappings to textual ISBNs, along with statistics about their usage in other records.
Column | Purpose |
---|---|
isbn_id | ISBN identifier |
isbn | Textual ISBNs |
Each type of ISBN (ISBN-10, ISBN-13) is considered a distinct ISBN. We also treat other ISBN-like identifiers, particularly ASINs, as ISBNs.
Additional fields in this table contain the number of records from different sources that reference this ISBN.
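The mapping from textual ISBNs to numeric IDs can be illustrated with a small in-memory interner. This is only a sketch under stated assumptions: the `IsbnTable` type, its method names, and the choice to start IDs at 1 are hypothetical and are not taken from the actual tooling. It shows that each distinct string, including the ISBN-10 and ISBN-13 forms of the same book, receives its own ID.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory ISBN interner (illustration only, not the real tooling).
#[derive(Default)]
pub struct IsbnTable {
    /// Textual ISBN -> numeric ID.
    ids: HashMap<String, i32>,
    /// Numeric ID - 1 -> textual ISBN.
    isbns: Vec<String>,
}

impl IsbnTable {
    /// Return the ID for an ISBN string, assigning the next free ID if it is new.
    pub fn intern(&mut self, isbn: &str) -> i32 {
        if let Some(&id) = self.ids.get(isbn) {
            return id;
        }
        // Assumption: IDs are assigned sequentially starting at 1.
        let id = self.isbns.len() as i32 + 1;
        self.isbns.push(isbn.to_string());
        self.ids.insert(isbn.to_string(), id);
        id
    }

    /// Look up the textual ISBN for an ID, if one has been assigned.
    pub fn lookup(&self, id: i32) -> Option<&str> {
        if id < 1 {
            return None;
        }
        self.isbns.get((id - 1) as usize).map(|s| s.as_str())
    }
}
```

Because the strings are compared verbatim, interning `0306406152` and `9780306406157` yields two different IDs, which matches the rule that each ISBN type is a distinct ISBN.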
File details
Field | Type |
---|---|
isbn_id | Int32 |
isbn | Utf8 |
LOC | UInt32 |
OL | UInt32 |
GR | Int64 |
AZ14 | UInt32 |
AZ18 | UInt32 |
Many other tables that work with ISBNs use ISBN IDs.
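For code that consumes this file, the column types listed above translate directly into Rust primitives. The following is a minimal sketch: the `IsbnRow` struct, its lowercase field names, and the `total_refs` helper are hypothetical and exist only to show the type mapping (Int32 to `i32`, Utf8 to `String`, UInt32 to `u32`, Int64 to `i64`).

```rust
/// Hypothetical row type mirroring the columns of all-isbns.parquet.
/// Field names are lowercased here for Rust style; the file uses LOC, OL, etc.
#[derive(Debug, Clone, PartialEq)]
pub struct IsbnRow {
    pub isbn_id: i32, // Int32
    pub isbn: String, // Utf8
    pub loc: u32,     // UInt32: records from the LOC source
    pub ol: u32,      // UInt32: records from the OL source
    pub gr: i64,      // Int64: records from the GR source
    pub az14: u32,    // UInt32: records from the AZ14 source
    pub az18: u32,    // UInt32: records from the AZ18 source
}

impl IsbnRow {
    /// Total number of referencing records across all sources.
    /// Counts are widened to i64 so the mixed column types add safely.
    pub fn total_refs(&self) -> i64 {
        i64::from(self.loc)
            + i64::from(self.ol)
            + self.gr
            + i64::from(self.az14)
            + i64::from(self.az18)
    }
}
```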
Book Codes
We also use book codes, common identifiers for integrated ‘books’ across data sets. These are derived from identifiers in the various data sets. Each book code source is assigned to a different 100M number band (a ‘numspace’) so we can, if needed, derive the source from a book code.
Source | Namespace Object | Numspace |
---|---|---|
OL Work | NS_WORK | 100M |
OL Edition | NS_EDITION | 200M |
LOC Record | NS_LOC_REC | 300M |
GR Work | NS_GR_WORK | 400M |
GR Book | NS_GR_BOOK | 500M |
LOC Work | NS_LOC_WORK | 600M |
LOC Instance | NS_LOC_INSTANCE | 700M |
ISBN | NS_ISBN | 900M |
The bookdata::ids::codes module contains the Rust API for working with these codes (including each of the namespace objects) and converting identifiers into and out of them.
The LOC Work and Instance sources are not currently used; they are intended for future use when we are able to import BIBFRAME data from the Library of Congress.
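The numspace scheme described in this section can be sketched as simple offset arithmetic. The code below is an illustration only and is not the bookdata::ids::codes API: the `Source` enum, `BAND` constant, and `to_code`/`from_code` functions are hypothetical names, and the sketch assumes every source identifier fits within a single 100M band.

```rust
/// Width of one numspace band (100M), per the table of sources.
pub const BAND: i32 = 100_000_000;

/// Hypothetical enum standing in for the namespace objects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    OlWork,
    OlEdition,
    LocRecord,
    GrWork,
    GrBook,
    LocWork,
    LocInstance,
    Isbn,
}

impl Source {
    pub const ALL: [Source; 8] = [
        Source::OlWork,
        Source::OlEdition,
        Source::LocRecord,
        Source::GrWork,
        Source::GrBook,
        Source::LocWork,
        Source::LocInstance,
        Source::Isbn,
    ];

    /// Start of this source's band, taken from the Numspace column.
    pub fn base(self) -> i32 {
        match self {
            Source::OlWork => 100_000_000,
            Source::OlEdition => 200_000_000,
            Source::LocRecord => 300_000_000,
            Source::GrWork => 400_000_000,
            Source::GrBook => 500_000_000,
            Source::LocWork => 600_000_000,
            Source::LocInstance => 700_000_000,
            Source::Isbn => 900_000_000,
        }
    }

    /// Convert a source identifier to a book code.
    /// Assumption: identifiers outside 0..BAND are rejected.
    pub fn to_code(self, id: i32) -> Option<i32> {
        if (0..BAND).contains(&id) {
            Some(self.base() + id)
        } else {
            None
        }
    }

    /// Recover the source and original identifier from a book code.
    /// Returns None for codes that fall in no assigned band.
    pub fn from_code(code: i32) -> Option<(Source, i32)> {
        Source::ALL
            .iter()
            .copied()
            .find(|s| code >= s.base() && code - s.base() < BAND)
            .map(|s| (s, code - s.base()))
    }
}
```

Under these assumptions, GR Work identifier 42 becomes book code 400000042, and decoding that code returns the GR Work source together with 42. A code in the unassigned 800M band decodes to nothing.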