Module names

Extract and normalize author names.

Names in the book data — both in author records and in their references in book records — come in a variety of formats. This module is responsible for expanding and normalizing those name formats to improve data linkability. Some records also include a year or date range for the author’s lifetime. We normalize names as follows:

  • If the name is “Last, First”, we emit both “Last, First” and “First Last” variants.
  • If the name has a year, we emit each variant both with and without the year.
  • Leading and trailing junk is stripped from the name.

This maximizes our ability to match records across sources recording names in different formats.
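
The rules above can be illustrated with a minimal sketch. This is not the module's implementation (which uses a PEG parser in the parse submodule). The names sketch_variants and split_year are hypothetical, and the definitions of "junk" (surrounding whitespace, commas, and semicolons) and of a year (a trailing comma-separated field that starts with a digit) are assumptions made for illustration only.

```rust
/// Hypothetical helper: split a trailing year or date range off a name.
/// Assumption: the year is the last comma-separated field and starts with a digit.
fn split_year(name: &str) -> (&str, Option<&str>) {
    if let Some((head, tail)) = name.rsplit_once(',') {
        let tail = tail.trim();
        if tail.chars().next().map_or(false, |c| c.is_ascii_digit()) {
            return (head.trim(), Some(tail));
        }
    }
    (name, None)
}

/// Illustrative only: this is NOT the module's `name_variants`.
fn sketch_variants(raw: &str) -> Vec<String> {
    // Assumption: "junk" means surrounding whitespace, commas, and semicolons.
    let cleaned = raw.trim_matches(|c: char| c.is_whitespace() || c == ',' || c == ';');
    let (name, year) = split_year(cleaned);

    // Start with the name as written; add "First Last" if it is "Last, First".
    let mut bases = vec![name.to_string()];
    if let Some((last, first)) = name.split_once(',') {
        let (last, first) = (last.trim(), first.trim());
        if !last.is_empty() && !first.is_empty() {
            bases.push(format!("{} {}", first, last));
        }
    }

    // Emit each base form with the year (when present) and without it.
    let mut out = Vec::new();
    for base in bases {
        if let Some(y) = year {
            out.push(format!("{}, {}", base, y));
        }
        out.push(base);
    }
    out
}
```

Under these assumptions, the input "Austen, Jane, 1775-1817" yields four variants: "Austen, Jane, 1775-1817", "Austen, Jane", "Jane Austen, 1775-1817", and "Jane Austen". A name with no comma and no year yields only itself.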

name_variants is the primary entry point for this module. The clean_name function cleans up a name without parsing it, and is intended for emitting names from book records.

Re-exports§

pub use types::NameError;
pub use parse::parse_name_entry;

Modules§

parse 🔒
PEG parser for name variants.
types 🔒

Functions§

clean_name
Clean up a name from unnecessary special characters.
name_variants
Extract all variants from a name.
preclean 🔒
Pre-clean a string without copying.