hnsync

command module
v0.0.0-...-18b5d25
Published: Dec 22, 2024 License: GPL-3.0 Imports: 17 Imported by: 0

hnsync is a Go application that syncs Hacker News items to a local SQLite database. It fetches HN items (stories and comments) up to the current maxitem reported by the HN API, then exits. Recent items are also refreshed periodically during the run. For near real-time updates, restart hnsync each time it exits.
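
The "fetch every item up to maxitem" loop can be pictured as a pool of goroutines draining a channel of item IDs. The sketch below is illustrative only and is not taken from the hnsync source: syncRange and fetchFunc are hypothetical names, and the fetch step is a stub rather than a real HTTP call to the HN API.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc is a hypothetical stand-in for a call to the HN API that
// returns the raw JSON for one item.
type fetchFunc func(id int) (string, error)

// syncRange fetches every item ID in [1, maxItem] using a fixed number of
// worker goroutines and returns the fetched data keyed by item ID.
func syncRange(maxItem, workers int, fetch fetchFunc) map[int]string {
	if workers < 1 {
		workers = 1 // avoid blocking forever when no worker reads the channel
	}
	ids := make(chan int)
	results := make(map[int]string)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				data, err := fetch(id)
				if err != nil {
					continue // a real tool would likely retry or record the failure
				}
				mu.Lock()
				results[id] = data
				mu.Unlock()
			}
		}()
	}

	// Feed IDs to the workers, then signal that no more work is coming.
	for id := 1; id <= maxItem; id++ {
		ids <- id
	}
	close(ids)
	wg.Wait()
	return results
}

func main() {
	// Stub fetcher: fabricates a tiny JSON document instead of using the network.
	fake := func(id int) (string, error) {
		return fmt.Sprintf(`{"id":%d}`, id), nil
	}
	items := syncRange(5, 2, fake)
	fmt.Println(len(items), items[3])
}
```

A persistent store would replace the in-memory map; the map keeps the sketch self-contained.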

Prerequisites

  • Go: Version 1.22 or later
  • SQLite: Version 3

Usage

# Run with default settings
go run github.com/larose/hnsync@latest

# Customize settings
go run github.com/larose/hnsync@latest -workers 100 -db custom.db

# Enable profiling for performance monitoring
go run github.com/larose/hnsync@latest -profile

Flags

  • -workers (default: 200): Number of concurrent worker goroutines for syncing.
  • -db (default: hn.db): Filename for the SQLite database.
  • -profile: Enables a pprof server on :6060 for performance analysis.
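
For readers wondering how flags like these map onto Go's standard flag package, here is a minimal sketch. It assumes nothing about hnsync's actual source: the config struct and parseFlags function are hypothetical, and only the flag names and defaults come from the list above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// config holds the three documented settings. The struct itself is hypothetical.
type config struct {
	workers int
	db      string
	profile bool
}

// parseFlags parses command-line arguments using the documented flag names
// and defaults. A dedicated FlagSet keeps the function easy to call repeatedly.
func parseFlags(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("hnsync", flag.ContinueOnError)
	fs.IntVar(&cfg.workers, "workers", 200, "number of concurrent worker goroutines")
	fs.StringVar(&cfg.db, "db", "hn.db", "filename for the SQLite database")
	fs.BoolVar(&cfg.profile, "profile", false, "enable a pprof server on :6060")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("workers=%d db=%s profile=%t\n", cfg.workers, cfg.db, cfg.profile)
}
```

With no arguments, the parsed values fall back to the defaults listed above (200 workers, hn.db, profiling off).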

Schema

The synced data is stored in a table named hn_items, which contains the following key columns:

  • id: Unique Hacker News item identifier (INTEGER).
  • data: Raw JSON data from the Hacker News API (TEXT).
  • _last_synced_at: Timestamp of the last successful sync (TEXT). Useful for tracking changes.

Additional internal columns prefixed with _ are used for syncing operations and are not intended for direct use.

Example Queries

Get one item:

sqlite> SELECT id, data FROM hn_items LIMIT 1;
id  data
--  ------------------------------------------------------------
1   {"by":"pg","descendants":15,"id":1,"kids":[15,234509,487171,
    82729],"score":57,"time":1160418111,"title":"Y Combinator","
    type":"story","url":"http://ycombinator.com"}
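
Because the data column holds raw JSON, client code has to decode it before use. The following Go sketch decodes the row shown above with encoding/json. The item struct and decodeItem function are hypothetical names chosen for illustration, and the struct covers only the fields visible in this one sample, so other item types may carry fields it ignores.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// item mirrors the JSON fields visible in the sample row. It is an
// illustrative type, not one defined by hnsync.
type item struct {
	ID          int    `json:"id"`
	By          string `json:"by"`
	Type        string `json:"type"`
	Title       string `json:"title"`
	URL         string `json:"url"`
	Score       int    `json:"score"`
	Time        int64  `json:"time"` // Unix timestamp in seconds
	Descendants int    `json:"descendants"`
	Kids        []int  `json:"kids"`
}

// decodeItem parses the contents of one data column value.
func decodeItem(data string) (item, error) {
	var it item
	err := json.Unmarshal([]byte(data), &it)
	return it, err
}

func main() {
	// The sample row from the query above, joined back onto a single line.
	const row = `{"by":"pg","descendants":15,"id":1,"kids":[15,234509,487171,82729],"score":57,"time":1160418111,"title":"Y Combinator","type":"story","url":"http://ycombinator.com"}`

	it, err := decodeItem(row)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(it.ID, it.By, it.Title, len(it.Kids))
}
```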

Retrieve items updated in the last hour:

sqlite> SELECT id, data FROM hn_items WHERE _last_synced_at > datetime('now', '-1 hour');
...

Note on Indexing

By default, the _last_synced_at column has no index. If you filter by this column often, consider adding one to improve query performance:

CREATE INDEX hn_items_last_synced_at ON hn_items (_last_synced_at);

License

This project is licensed under the GNU General Public License v3.0. Refer to the LICENSE file for more details.

Copyright (C) 2024 Mathieu Larose
