saos-tm.extractor.money

This module contains tools for extracting sums of money from raw text.

extract-max-money

(extract-max-money s)

Extracts the largest sum of money in Polish zł from a given string s. The result format is analogous to that of extract-money: a map with two keys:

  • :amount with a bigdec number representing the amount of money
  • :text with the precise text referencing detected money sum
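Since the result has the same shape as a single extract-money entry, the largest sum can be thought of as a selection over a sequence of such maps. The following is a minimal sketch of that idea, not the library's actual implementation; max-by-amount is a hypothetical helper, and skipping entries whose :amount is not a number is an assumption.

```clojure
;; Hypothetical helper, not part of saos-tm.extractor.money.
(defn max-by-amount
  "Returns the map with the largest numeric :amount,
  or nil when no entry has a numeric :amount."
  [results]
  (let [parsed (filter #(number? (:amount %)) results)]
    (when (seq parsed)
      (apply max-key :amount parsed))))

;; Entries with a non-numeric :amount (such as "ERROR") are ignored.
(max-by-amount [{:amount 123.33M :text "123 zł 33 gr"}
                {:amount "ERROR" :text "x zł"}
                {:amount 5000M :text "5 tys. zł"}])
;; => {:amount 5000M, :text "5 tys. zł"}
```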

extract-money

(extract-money s)

Extracts all references to sums of money in Polish zł present in a given string s.

The result is a list of maps, each containing two keys:

  • :amount with a bigdec number representing the amount of money, or “ERROR” if the text could not be parsed
  • :text with the precise text referencing detected money sum

Example:

(extract-money "To kosztowało 123 zł 33 gr")

[{:amount 123.33M :text "123 zł 33 gr"}]

It works by finding the zł token and parsing the number that precedes it. It supports the prefixes tys. (thousand) and mln (million) as well as the suffix gr. If the algorithm has difficulty parsing the given text, it returns “ERROR” in the :amount field.
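The approach described above can be illustrated with a small regex-based sketch. This is not the library's code: money-re, multipliers and sketch-extract-money are hypothetical names, the pattern handles only whole-number amounts, and the “ERROR” fallback is not reproduced.

```clojure
;; Hypothetical sketch, not the saos-tm implementation.
;; Groups: 1 = number, 2 = optional multiplier, 3 = optional grosze.
(def money-re
  #"(\d+)\s*(tys\.|mln)?\s*zł(?:\s+(\d{1,2})\s*gr)?")

;; Multiplier tokens that may appear between the number and zł.
(def multipliers {"tys." 1000M, "mln" 1000000M})

(defn sketch-extract-money
  "Returns a vector of {:amount bigdec, :text matched-text} maps."
  [s]
  (vec
    (for [[text number mult gr] (re-seq money-re s)]
      {:amount (+ (* (bigdec number) (get multipliers mult 1M))
                  ;; grosze are hundredths of a złoty
                  (if gr (/ (bigdec gr) 100M) 0M))
       :text text})))

(sketch-extract-money "To kosztowało 123 zł 33 gr")
;; => [{:amount 123.33M, :text "123 zł 33 gr"}]
```

Keeping the multipliers in a map means a further abbreviation could be added without touching the arithmetic, although the regex alternation would need the same token.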