Null Pointer

Code

Tutorials, snippets, discoveries and general thoughts on the business of software engineering.

Python Pipes for Text Tokenization

Tokenizing text is something that I tend to do pretty often, typically as the beginning of an NLP workflow. The normal workflow goes something like this:

1. Build a generator to stream in some sentences / documents / whatever.
2. Perform some set of transformations on the text.
3. Feed the result
27 Jul 2015 8 min read
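
To make the workflow in the excerpt a bit more concrete, here is a minimal sketch of a pipe-style tokenizer built from plain Python generators. It is not taken from the post itself: the names `stream_sentences`, `lowercase`, `tokenize`, `drop_short` and `pipeline` are all hypothetical, the regex is a deliberately crude stand-in for a real tokenizer, and the final `list(...)` call is only a placeholder for whatever ends up consuming the result.

```python
import re
from functools import reduce


def stream_sentences(lines):
    """Step 1: lazily yield stripped, non-empty sentences from any iterable of strings."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def lowercase(sentences):
    """Transformation: lowercase each sentence."""
    for sentence in sentences:
        yield sentence.lower()


def tokenize(sentences):
    """Transformation: split each sentence into word tokens (crude regex, an assumption)."""
    for sentence in sentences:
        yield re.findall(r"[a-z0-9']+", sentence)


def drop_short(token_lists, min_len=2):
    """Transformation: discard tokens shorter than min_len characters."""
    for tokens in token_lists:
        yield [t for t in tokens if len(t) >= min_len]


def pipeline(source, *stages):
    """Step 2: chain the stages so each one consumes the previous stage's stream."""
    return reduce(lambda stream, stage: stage(stream), stages, source)


docs = ["The quick brown fox.", "", "  It's a dog's life!  "]

# Step 3 (placeholder): collect the stream; a real consumer would iterate lazily.
tokens = list(pipeline(stream_sentences(docs), lowercase, tokenize, drop_short))
print(tokens)
```

Because every stage is a generator, nothing is read or transformed until the consumer asks for the next item, so the same chain should work on a file handle or a database cursor without loading everything into memory.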