parseval: A pythonic data validator

parseval is a data validation tool for Python. The following parsers are available:

FieldParser:

Signature: FieldParser(start: int = 0, end: int = 0, quoted: int = 0, enforce_type: bool = True)

Parameters:

Available APIs:

Signature: not_null(default_value: any = None)

Parameters:

  • default_value: Default value for a column that must not be null

Signature: value_set(values: typing.List, nullable: bool = True)

Parameters:

  • values: Set of valid values for this column
  • nullable: If True, the empty string and None are treated as valid values, in addition to the values in the provided list

Signature: max_value(value: any)

Parameters:

  • value: Maximum allowed value for the column

Signature: min_value(value: any)

Parameters:

  • value: Minimum allowed value for the column

Signature: range(lower_bound: any, upper_bound: any)

Parameters:

  • lower_bound: Minimum allowed value for the column
  • upper_bound: Maximum allowed value for the column

Signature: add_func(f: function)

Parameters:

  • f: Custom function to be added to the parser
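
The checks listed above can be pictured in plain Python. The sketch below does not import parseval and is not its implementation; the helper names `check_value_set` and `check_range` are hypothetical and only mirror the documented parameter descriptions. Treating the range bounds as inclusive is an assumption.

```python
from typing import Any, List


def check_value_set(data: Any, values: List[Any], nullable: bool = True) -> bool:
    """Return True if `data` is allowed under the documented value_set rules."""
    # With nullable=True, None and the empty string count as valid values.
    if nullable and (data is None or data == ""):
        return True
    return data in values


def check_range(data: Any, lower_bound: Any, upper_bound: Any) -> bool:
    """Return True if `data` lies between the bounds (inclusive bounds assumed)."""
    return lower_bound <= data <= upper_bound


print(check_value_set("", ["A", "B"]))                  # empty string, nullable
print(check_value_set(None, ["A", "B"], nullable=False))
print(check_range(5, 1, 10))
```

The same two helpers work for strings, floats, and integers because they rely only on Python's `in` and comparison operators.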



StringParser:

Signature: StringParser(start: int = 0, end: int = 0, quoted: int = 0, enforce_type: bool = True)

Parameters:

Available APIs:

Signature: not_null(default_value: str = None, allow_white_space: bool = False)

Parameters:

  • default_value: Default value for a column that must not be null
  • allow_white_space: If True, whitespace-only values are not treated as null.

Signature: value_set(values: typing.List[str], nullable: bool = True)

Parameters:

  • values: Set of valid values for this column
  • nullable: If True, the empty string and None are treated as valid values, in addition to the values in the provided list

Signature: max_value(value: str)

Parameters:

  • value: Maximum allowed value for the column

Signature: min_value(value: str)

Parameters:

  • value: Minimum allowed value for the column

Signature: regex_match(pattern: str, nullable=True)

Parameters:

  • pattern: Pattern to match against the data
  • nullable: If True, the empty string and None are treated as valid values, in addition to values that match the provided pattern

Signature: change_case(case_type: str = 'S')

Parameters:

  • case_type: Target case: {'U'/'u': UPPERCASE, 'L'/'l': lowercase, 'S'/'s': Sentence Case}

Signature: range(lower_bound: str, upper_bound: str)

Parameters:

  • lower_bound: Minimum allowed value for the column
  • upper_bound: Maximum allowed value for the column

Signature: add_func(f: function)

Parameters:

  • f: Custom function to be added to the parser
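
As a rough illustration of the string-specific options, the sketch below reproduces the documented `allow_white_space` and `nullable` semantics in plain Python. It is not parseval code: the function names `fill_null` and `matches` are hypothetical, raising `ValueError` when no default is available is an assumption, and so is the use of `re.match` (anchored at the start of the string only).

```python
import re
from typing import Optional


def fill_null(data: Optional[str], default_value: Optional[str] = None,
              allow_white_space: bool = False) -> str:
    """Replace a null value with the default, following the not_null description."""
    # Whitespace-only strings count as null unless allow_white_space is True.
    is_null = (
        data is None
        or data == ""
        or (not allow_white_space and data.strip() == "")
    )
    if not is_null:
        return data
    if default_value is None:
        # Assumed behaviour: a null with no default is an error.
        raise ValueError("null value and no default provided")
    return default_value


def matches(data: Optional[str], pattern: str, nullable: bool = True) -> bool:
    """Return True if `data` passes the documented regex_match rules."""
    if nullable and (data is None or data == ""):
        return True
    if data is None:
        return False
    return re.match(pattern, data) is not None
```

For example, `fill_null("   ", "N/A")` yields the default, while `fill_null("   ", "N/A", allow_white_space=True)` keeps the whitespace string unchanged.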



FloatParser:

Signature: FloatParser(start: int = 0, end: int = 0, quoted: int = 0, enforce_type: bool = True)

Parameters:

Available APIs:

Signature: not_null(default_value: float = None)

Parameters:

  • default_value: Default value for a column that must not be null

Signature: value_set(values: typing.List[float], nullable: bool = True)

Parameters:

  • values: Set of valid values for this column
  • nullable: If True, the empty string and None are treated as valid values, in addition to the values in the provided list

Signature: max_value(value: float)

Parameters:

  • value: Maximum allowed value for the column

Signature: min_value(value: float)

Parameters:

  • value: Minimum allowed value for the column

Signature: range(lower_bound: float, upper_bound: float)

Parameters:

  • lower_bound: Minimum allowed value for the column
  • upper_bound: Maximum allowed value for the column

Signature: add_func(f: function)

Parameters:

  • f: Custom function to be added to the parser



IntegerParser:

Signature: IntegerParser(start: int = 0, end: int = 0, quoted: int = 0, enforce_type: bool = True)

Parameters:

Available APIs:

Signature: not_null(default_value: int = None)

Parameters:

  • default_value: Default value for a column that must not be null

Signature: value_set(values: typing.List[int], nullable: bool = True)

Parameters:

  • values: Set of valid values for this column
  • nullable: If True, the empty string and None are treated as valid values, in addition to the values in the provided list

Signature: max_value(value: int)

Parameters:

  • value: Maximum allowed value for the column

Signature: min_value(value: int)

Parameters:

  • value: Minimum allowed value for the column

Signature: range(lower_bound: int, upper_bound: int)

Parameters:

  • lower_bound: Minimum allowed value for the column
  • upper_bound: Maximum allowed value for the column

Signature: add_func(f: function)

Parameters:

  • f: Custom function to be added to the parser



BooleanParser:

Signature: BooleanParser(start: int = 0, end: int = 0, quoted: int = 0, enforce_type: bool = True)

Parameters:


DatetimeParser:

Signature: DatetimeParser(start: int = 0, end: int = 0, formats: typing.List =['%Y%m%d', '%Y%m%d%H%M%S'], quoted: int = 0, enforce_type: bool = True)

Parameters:

Available APIs:

Signature: not_null(default_value: typing.Union[str, datetime.datetime] = None, format: str = '%Y%m%d%H%M%S')

Parameters:

  • default_value: Default value for a column that must not be null
  • format: Format of the provided default value. If a datetime object is provided as the default value, this parameter has no effect.

Signature: value_set(values: typing.List[typing.Union[str, datetime.datetime]], format='%Y%m%d%H%M%S', nullable: bool = True)

Parameters:

  • values: Set of valid values for this column
  • format: Format of the provided allowed values. If datetime objects are provided as allowed values, this parameter has no effect.
  • nullable: If True, the empty string and None are treated as valid values, in addition to the values in the provided list

Signature: max_value(value: typing.Union[str, datetime.datetime], format: str = '%Y%m%d%H%M%S')

Parameters:

  • value: Maximum allowed value for the column
  • format: Format of the provided value. If a datetime object is provided, this parameter has no effect.

Signature: min_value(value: typing.Union[str, datetime.datetime], format: str = '%Y%m%d%H%M%S')

Parameters:

  • value: Minimum allowed value for the column
  • format: Format of the provided value. If a datetime object is provided, this parameter has no effect.

Signature: range(lower_bound: typing.Union[str, datetime.datetime], upper_bound: typing.Union[str, datetime.datetime], format='%Y%m%d%H%M%S')

Parameters:

  • lower_bound: Minimum allowed value for the column
  • upper_bound: Maximum allowed value for the column
  • format: Format of the provided bounds. If datetime objects are provided as bounds, this parameter has no effect.

Signature: add_func(f: function)

Parameters:

  • f: Custom function to be added to the parser
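
To show how a `format` argument and a list of accepted `formats` interact with values given either as strings or as datetime objects, here is a minimal sketch built on the standard library only. It is not taken from parseval; `to_datetime` and `in_range` are hypothetical names, and inclusive bounds are an assumption.

```python
import datetime
from typing import List, Union

DateLike = Union[str, datetime.datetime]


def to_datetime(value: DateLike, formats: List[str]) -> datetime.datetime:
    """Convert a string to datetime using the first format that fits."""
    # A datetime object is used as-is, so the formats have no effect on it.
    if isinstance(value, datetime.datetime):
        return value
    for fmt in formats:
        try:
            return datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"{value!r} matches none of the formats {formats}")


def in_range(value: DateLike, lower_bound: DateLike, upper_bound: DateLike,
             format: str = "%Y%m%d%H%M%S") -> bool:
    """Check lower_bound <= value <= upper_bound (inclusive bounds assumed)."""
    default_formats = ["%Y%m%d", "%Y%m%d%H%M%S"]
    current = to_datetime(value, default_formats)
    low = to_datetime(lower_bound, [format])
    high = to_datetime(upper_bound, [format])
    return low <= current <= high
```

Here a column value such as `"20200131"` is parsed with the first matching entry of the default formats, while the bounds are parsed with the single `format` string unless they are already datetime objects.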



ConstantParser:

Signature: ConstantParser(value)

Parameters:



Parser:

Signature: Parser(schema: typing.List[typing.Tuple] = [], input_row_format: str = "delimited", input_row_sep: str = "|", parsed_row_format: str = "delimited", parsed_row_sep: str = None, stop_on_error: int = 0)

Parameters:

Available APIs:

Signature: parse(data: typing.Union[typing.List[typing.Union[str, typing.Dict]], typing.TextIO])

Parameters:

  • data: Input data set
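
The idea of applying a schema to "|"-delimited rows can be sketched without the library. The example below is only an illustration of that idea: `parse_rows` is a hypothetical function, and the schema shape used here (a list of `(column_name, converter)` tuples) is an assumption, not parseval's documented schema format.

```python
import io
from typing import Any, Callable, Iterable, List, Tuple

# Assumed schema shape for this sketch: (column name, converter callable).
Schema = List[Tuple[str, Callable[[str], Any]]]


def parse_rows(rows: Iterable[str], schema: Schema, sep: str = "|") -> List[List[Any]]:
    """Split each delimited row and run every field through its column converter."""
    parsed = []
    for row in rows:
        # Rows read from a text file keep their trailing newline, so strip it.
        fields = row.rstrip("\n").split(sep)
        if len(fields) != len(schema):
            raise ValueError(f"expected {len(schema)} fields, got {len(fields)}")
        parsed.append([convert(field) for (_, convert), field in zip(schema, fields)])
    return parsed


schema = [("id", int), ("name", str), ("amount", float)]
print(parse_rows(["1|abc|2.5"], schema))
print(parse_rows(io.StringIO("2|def|3.0\n3|ghi|4.5\n"), schema))
```

Because the function only iterates over `rows`, it accepts both a list of strings and an open text stream, which mirrors the two input kinds named in the `parse` signature.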




For detailed installation and usage information, please see the project repository: https://github.com/saumalya75/parseval

For any further queries, reach out to saumalya75@gmail.com or http://linkedin.com/in/saumalya-sarkar-b3712817b