Human Name Parsing in PHP
Parsing human names is not exactly easy, but it can be done. Keith Beckman’s nameparse.php is an excellent PHP library for doing this.
nameparse.php can recognize names in “[title] first [middles] last [,] [suffix]” and “last, first [middles] [,] [suffix]” forms, which, when you think about it, cover most if not all well-formed name input formats.
nameparse.php handles last names of arbitrary complexity, such as “bin Laden”, “van der Vort”, and “Garcia y Vega”, as well as middle names of arbitrary size and complexity, differentiating between most last names and the first or middle names or initials preceding them.
Examples of names correctly parsed by nameparse.php:
- Doe, John. A. Kenneth III
- Velasquez y Garcia, Dr. Juan, Jr.
- Dr. Juan Q. Xavier de la Vega, Jr.
To use it, simply include() or require() nameparse.php and call parse_name($string) on any name. parse_name() returns an associative array of all name segments found, keyed by “title”, “first”, “middle”, “last”, and “suffix”.
Do note that no spelling, capitalization, or punctuation of titles, prefixes, or suffixes is normalized. That is, every token remains as entered: nameparse.php is a semantic parser only. If you want orthographic or other normalization, you’ll have to postprocess the output. However, since the name is now semantically parsed, such postprocessing is (for applications which require it) simple.
print_r(parse_name('Velasquez y Garcia, Dr. Juan Q. Xavier III'));
yields . . .
Array ( [title] => Dr. [first] => Juan [middle] => Q. Xavier [suffix] => III [last] => Velasquez y Garcia )
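As a rough illustration of the postprocessing mentioned above, here is a minimal sketch of a normalizer that works on an array shaped like the output shown. The function normalize_name_parts() is hypothetical and not part of nameparse.php, and the house style it applies (titles and suffixes without a trailing period, Roman-numeral suffixes in upper case) is only an assumption chosen for the example. The sketch uses a hand-written array rather than calling parse_name(), so it runs without the library.

```php
<?php
// Hypothetical helper, not part of nameparse.php.
function normalize_name_parts(array $parts): array
{
    // Fill in any segment that is absent, so every key is safe to read.
    $parts += ['title' => '', 'first' => '', 'middle' => '', 'last' => '', 'suffix' => ''];

    // Collapse runs of whitespace in every segment.
    foreach ($parts as $key => $value) {
        $parts[$key] = trim(preg_replace('/\s+/', ' ', $value));
    }

    // Assumed house style: titles as "Dr", without the trailing period.
    $parts['title'] = ucfirst(strtolower(rtrim($parts['title'], '.')));

    // Assumed house style: Roman-numeral suffixes in upper case, others as "Jr".
    $suffix = rtrim($parts['suffix'], '.');
    $parts['suffix'] = preg_match('/^[ivx]+$/i', $suffix)
        ? strtoupper($suffix)
        : ucfirst(strtolower($suffix));

    return $parts;
}

// Input shaped like the parse_name() output shown above, with messier tokens.
$parsed = ['title' => 'DR.', 'first' => 'Juan', 'middle' => 'Q.  Xavier',
           'suffix' => 'iii', 'last' => 'Velasquez y Garcia'];
$clean = normalize_name_parts($parsed);
echo "{$clean['title']} {$clean['first']} {$clean['middle']} {$clean['last']} {$clean['suffix']}\n";
```

The sketch deliberately leaves the capitalization of first, middle, and last names alone: last names such as “bin Laden” or “van der Vort” contain lower-case particles that a blanket ucwords() would damage.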