freddie

command module
v1.0.1
Published: Aug 28, 2022 License: Apache-2.0 Imports: 10 Imported by: 0

README

package freddie

This package imports the loan-level historic data into ClickHouse. A single table is built. Time-varying fields are held in nested arrays in this table.

The package performs QA on the data as well as adding a handful of extra fields:

- vintage (e.g. 2010Q2)
- standard - Y/N field, Y = standard process loan
- loan age based on first pay date
- numeric dq field
- reo flag
- property value at origination
- file names from which the loan was loaded
- QA results. There are three sets of fields:
    - The nested table qa that has two arrays:
        - field. The name of a field that has validation issues.
        - cntFail. The number of months for which this field failed QA. For static fields, this value will be 1.
    - allFail. An array of the names of fields that failed QA. For monthly fields, this means the field failed for all months.
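
The relationship between the qa arrays and allFail can be illustrated with a small sketch. This is not the package's own code: the type `qaSummary`, the function `summarize`, and the field names in `main` are hypothetical, and the input is assumed to be one pass/fail flag per month for each field (a single flag for static fields).

```go
package main

import (
	"fmt"
	"sort"
)

// qaSummary mirrors the shape described above: the parallel arrays
// field/cntFail of the nested table qa, plus the allFail array.
type qaSummary struct {
	Field   []string
	CntFail []int
	AllFail []string
}

// summarize takes, for each field name, one flag per month (true = failed QA).
// Static fields are assumed to carry exactly one flag.
func summarize(fails map[string][]bool) qaSummary {
	names := make([]string, 0, len(fails))
	for name := range fails {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output

	var out qaSummary
	for _, name := range names {
		months := fails[name]
		cnt := 0
		for _, failed := range months {
			if failed {
				cnt++
			}
		}
		if cnt == 0 {
			continue // field passed everywhere, so it is not reported
		}
		out.Field = append(out.Field, name)
		out.CntFail = append(out.CntFail, cnt)
		if cnt == len(months) {
			out.AllFail = append(out.AllFail, name) // failed in every month
		}
	}
	return out
}

func main() {
	s := summarize(map[string][]bool{
		"fico": {true},              // static field: a failure counts as 1
		"upb":  {false, true, true}, // monthly field: failed 2 of 3 months
		"dq":   {true, true},        // monthly field: failed every month
		"rate": {false, false},      // never failed
	})
	fmt.Println(s.Field, s.CntFail, s.AllFail)
	// prints: [dq fico upb] [2 1 2] [dq fico]
}
```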

The command-line parameters are:

-host
    ClickHouse IP address. Default value: 127.0.0.1
-user <user>
    ClickHouse user. Default value: default
-password <password>
    ClickHouse password for user. Default value: <empty>
-table <db.table>
    ClickHouse table in which to insert the data. Default value: <none>
-create <Y|N>
    if Y, then the table is created/reset. Default value: Y
-dir <path>
    directory with Freddie Mac text files
-tmp <db>
    ClickHouse database to use for temporary tables
-concur <num>
    Number of concurrent processes to use in loading monthly files. Default value: 1
-memory <num>
    Max memory usage by ClickHouse. Default value: 40000000000
-groupby <num>
    max_bytes_before_external_groupby ClickHouse parameter. Default value: 20000000000
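
For readers who want to see how such a parameter set maps onto Go's standard `flag` package, here is a minimal sketch. It is not taken from the package source: the `config` struct and `parseArgs` function are hypothetical, and the user/password defaults follow the godoc overview (user `default`, empty password).

```go
package main

import (
	"flag"
	"fmt"
)

// config is a hypothetical holder for the parameters listed above.
type config struct {
	Host, User, Password string
	Table, Create        string
	Dir, Tmp             string
	Concur               int
	Memory, GroupBy      int64
}

// parseArgs parses a command line with the documented defaults.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("freddie", flag.ContinueOnError)
	fs.StringVar(&c.Host, "host", "127.0.0.1", "ClickHouse IP address")
	fs.StringVar(&c.User, "user", "default", "ClickHouse user")
	fs.StringVar(&c.Password, "password", "", "ClickHouse password for user")
	fs.StringVar(&c.Table, "table", "", "ClickHouse table in which to insert the data")
	fs.StringVar(&c.Create, "create", "Y", "if Y, the table is created/reset")
	fs.StringVar(&c.Dir, "dir", "", "directory with Freddie Mac text files")
	fs.StringVar(&c.Tmp, "tmp", "", "ClickHouse database for temporary tables")
	fs.IntVar(&c.Concur, "concur", 1, "number of concurrent processes")
	fs.Int64Var(&c.Memory, "memory", 40000000000, "max memory usage by ClickHouse")
	fs.Int64Var(&c.GroupBy, "groupby", 20000000000, "max_bytes_before_external_groupby")
	err := fs.Parse(args)
	return c, err
}

func main() {
	// The table, directory, and database names here are placeholders.
	c, err := parseArgs([]string{"-table", "mtg.freddie", "-dir", "/data/standard", "-tmp", "tmp", "-concur", "4"})
	if err != nil {
		fmt.Println("bad arguments:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```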

Since the standard and non-standard data provided by Freddie Mac have the same format, both sets can be imported by this code either as a single table or two tables. To create a single table, run the app with

-create Y

for the first data source (e.g. standard) and

-create N

for the second data source.
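
As a sketch of the two-run procedure, the helper below builds the argument list for each run, using -create Y only for the first data source. The function `combinedRunArgs`, the binary name `freddie`, and the table, database, and directory names are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// combinedRunArgs returns one argument list per source directory.
// Only the first run creates/resets the table; later runs append to it.
func combinedRunArgs(table, tmpDB string, dirs []string) [][]string {
	runs := make([][]string, 0, len(dirs))
	for i, dir := range dirs {
		create := "N"
		if i == 0 {
			create = "Y"
		}
		runs = append(runs, []string{
			"-table", table,
			"-create", create,
			"-dir", dir,
			"-tmp", tmpDB,
		})
	}
	return runs
}

func main() {
	dirs := []string{"/data/standard", "/data/nonstandard"} // placeholder paths
	for _, args := range combinedRunArgs("mtg.freddie", "tmp", dirs) {
		fmt.Println("freddie " + strings.Join(args, " "))
	}
}
```

Running the two printed commands in order, against the same -table, would produce the combined table described above.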

A "DESCRIBE" of the table created by this package yields the output shown in img.png.

The data is available here: https://www.freddiemac.com/research/datasets/sf-loanlevel-dataset.

Documentation

Overview

package freddie. This package imports the loan-level residential mortgage data provided by Freddie Mac into ClickHouse. The data is available here: https://www.freddiemac.com/research/datasets/sf-loanlevel-dataset.

The final result is a single table with nested arrays for time-varying fields. Features:

  • The data is subject to QA. The results are presented as two string fields in a KeyVal format.
  • A "DESCRIBE" of the output table provides info on each field
  • New fields created are:
      • vintage (e.g. 2010Q2)
      • standard - Y/N flag, Y=standard process loan
      • loan age based on first pay date
      • numeric dq field
      • reo flag
      • property value at origination
      • file names from which the loan was loaded
      • QA results. There are three sets of fields:
          • The nested table qa that has two arrays:
              • field. The name of a field that has validation issues.
              • cntFail. The number of months for which this field failed QA. For static fields, this value will be 1.
          • allFail. An array of the names of fields that failed QA. For monthly fields, this means the field failed for all months.

Command-line parameters:

-host      ClickHouse IP address. Default: 127.0.0.1.
-user      ClickHouse user. Default: default.
-password  ClickHouse password for user. Default: <empty>.
-table     ClickHouse table in which to insert the data.
-create    If Y, then the table is created/reset. Default: Y.
-dir       Directory with Freddie Mac text files.
-tmp       ClickHouse database to use for temporary tables.
-concur    Number of concurrent processes to use in loading monthly files. Default: 1.
-memory    Max memory usage by ClickHouse. Default: 40000000000.
-groupby   max_bytes_before_external_groupby ClickHouse parameter. Default: 20000000000.

Since the standard and non-standard datasets have the same format, this utility can be used to create tables from either source. A combined table can be built by running the app twice, pointing to the same -table both times. Set -create Y on the first run and -create N on the second.

Look at the example in the joined package for the DESCRIBE output of the table.

Note that the table produced by this package has slightly fewer loans than the check figures provided by Freddie. The difference seems to be that there are some loans in the static file that are not in the monthly file. With data through 2021Q3, this totals 1484 standard loans (HARP and non-HARP), and 207 non-standard loans.
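
A discrepancy of this kind can be checked with a set difference over loan identifiers. The sketch below is a generic illustration, not the package's method: the function `missingFromMonthly` and the loan IDs are hypothetical, and in practice the comparison would more likely be done as a query inside ClickHouse.

```go
package main

import "fmt"

// missingFromMonthly returns the IDs present in the static file
// that never appear in the monthly file, in static-file order.
func missingFromMonthly(staticIDs, monthlyIDs []string) []string {
	seen := make(map[string]struct{}, len(monthlyIDs))
	for _, id := range monthlyIDs {
		seen[id] = struct{}{}
	}
	var missing []string
	for _, id := range staticIDs {
		if _, ok := seen[id]; !ok {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	static := []string{"L001", "L002", "L003"}          // made-up loan IDs
	monthly := []string{"L001", "L001", "L003", "L003"} // one row per loan-month
	fmt.Println(missingFromMonthly(static, monthly))
	// prints: [L002]
}
```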

Directories

Path Synopsis
Package joined joins the static and monthly tables created by the static and monthly packages.
Package monthly loads a single quarter of monthly data into ClickHouse.
Package static loads a single quarter of static data into ClickHouse.
