fannie

command module
v1.0.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 7, 2023 License: Apache-2.0 Imports: 11 Imported by: 0

README

Package fannie loads the standard, HARP and non-standard (excluded) loans made available by Fannie Mae into a ClickHouse database.

The final result is a single table with nested arrays for time-varying fields. Features:

  • New fields created are:
    • vintage (e.g. 2010Q2)
    • standard - Y/N flag, Y=standard process loan
    • loan age based on first pay date
    • numeric dq field
    • property value at origination
    • harp - Y/N flag, Y=HARP loan.
    • file name from which the loan was loaded
    • QA results. There are three sets of fields:
      • The nested table qa that has two arrays: - field. The name of a field that has validation issues. - cntFail. The number of months for which this field failed qa. For static fields, this value will be 1.
      • allFail. An array of field names which failed for qa. For monthly fields, this means the field failed for all months.
  • A "DESCRIBE" of the output table provides info on each field.

The command-line parameters are:

-host  ClickHouse IP address. Default: 127.0.0.1.
-user  ClickHouse user. Default: default
-password ClickHouse password for user. Default: <empty>.
-table ClickHouse table in which to insert the data.
-maptable.  Clickhouse table that maps pre-HARP loan ids to HARP ids.  This table is both created and used by the package.
-create if Y, then the table is created/reset. Default: Y.
-dir directory with Fannie Mae text files.
-tmp ClickHouse database to use for temporary tables.
-concur # of concurrent processes to use in loading monthly files. Default: 1.
-memory max memory usage by ClickHouse.  Default: 40000000000.
-groupby max_bytes_before_external_groupby ClickHouse paramter. Default: 20000000000.

The non-standard loan files have four additional fields. This package recognizes whether the file is standard or non-standard.

A combined table can be built by running the app twice pointing to the same -table. On the first run, set

-create Y 

and set

 -create N 

for the second run.

Note: for this package to run correctly, the standard loans should be loaded first, so that the table that maps HARP loans to their corresponding pre-HARP loan is loaded and available. This table isn't needed after all the files are loaded.

A DESCRIBE of the final table produces:

img.png

The data is available at here.

Documentation

Overview

Package fannie loads the standard, HARP and non-standard (excluded) loans made available by Fannie Mae into a ClickHouse database.

The final result is a single table with nested arrays for time-varying fields. Key features of this package:

  • New fields created are:
  • vintage (e.g. 2010Q2)
  • standard - Y/N flag, Y=standard process loan
  • loan age based on first pay date
  • numeric dq field
  • property value at origination
  • harp - Y/N flag, Y=HARP loan.
  • file name from which the loan was loaded
  • QA results. There are three sets of fields:
  • The nested table qa that has two arrays:
  • field. The name of a field that has validation issues.
  • cntFail. The number of months for which this field failed qa. For static fields, this value will be 1.
  • allFail. An array of field names which failed for qa. For monthly fields, this means the field failed for all months.
  • A "DESCRIBE" of the output table provides info on each field.

The command-line parameters are:

-host  ClickHouse IP address. Default: 127.0.0.1.
-user  ClickHouse user. Default: default
-password ClickHouse password for user. Default: <empty>.
-table ClickHouse table in which to insert the data.
-maptable.  Clickhouse table that maps pre-HARP loan ids to HARP ids.  This table is both created and used by the package.
-create if Y, then the table is created/reset. Default: Y.
-dir directory with Fannie Mae text files.
-tmp ClickHouse database to use for temporary tables.
-concur # of concurrent processes to use in loading monthly files. Default: 1.
-memory max memory usage by ClickHouse.  Default: 40000000000.
-groupby max_bytes_before_external_groupby ClickHouse paramter. Default: 20000000000.

The non-standard loans have four additional fields. This package recognizes whether the file is standard or not. A combined table can be built by running the app twice pointing to the same -table. On the first run, set -create Y and set -create N for the second run.

Note: for this package to run correctly, the standard loans should be loaded first, so that the table that maps HARP loans to their corresponding pre-HARP loan is loaded and available. This table isn't needed after all the files are loaded.

See the example under package collapse for the structure of the table.

The data is available at https://datadynamics.fanniemae.com/data-dynamics/#/reportMenu;category=HP.

Directories

Path Synopsis
Package collapse reduces the raw table that has one entry per loan per month to a table that has one entry per loan.
Package collapse reduces the raw table that has one entry per loan per month to a table that has one entry per loan.
Package raw reads in the raw data -- either the standard or non-standard files.
Package raw reads in the raw data -- either the standard or non-standard files.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL