stackoverflow

package module
v0.0.0-...-2a5d8a2 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 13, 2015 License: MIT Imports: 8 Imported by: 1

README

stackoverflow

Go library for parsing StackOverflow XML data dump files.

Installation: go get -u github.com/kjk/stackoverflow

See cmd/stats/main.go and cmd/tocsv/main.go for examples of how to use this library.

Docs that were helpful decoding the format:

Documentation

Constants

const (
	HistoryInitialTitle                 = 1
	HistoryInitialBody                  = 2
	HistoryInitialTags                  = 3
	HistoryEditTitle                    = 4
	HistoryEditBody                     = 5
	HistoyrEditTags                     = 6
	HistoryRollbackTitle                = 7
	HistoryRollbackBody                 = 8
	HistoryRollbackTags                 = 9
	HistoryPostClosed                   = 10
	HistoryPostReopened                 = 11
	HistoryPostDeleted                  = 12
	HistoryPostUndeleted                = 13
	HistoryPostLocked                   = 14
	HistoryPostUnlocked                 = 15
	HistoryCommunityOwned               = 16
	HistoryPostMigrated                 = 17
	HistoryQuestionMerged               = 18
	HistoryQuestionProtected            = 19
	HistoryQuestionUnprotected          = 20
	HistoryPostDisassociated            = 21
	HistoryQuestionUnmerged             = 22
	HistorySuggestedEditApplied         = 24
	HistoryPostTweeted                  = 25
	HistoryCommentDiscussionMovedToChat = 26
	HistoryPostNoticeAdded              = 33
	HistoryPostNoticeRemoved            = 34
	HistoryPostMigratedAway             = 35 // replaces id 17
	HistoryPostMigratedHere             = 36 // replaces id 17
	HistoryPostMergeSource              = 37
	HistoryPostMergeDestination         = 38
)

http://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public-data-dump-and-sede?rq=1

const (
	LinkTypeLinked    = 1
	LinkTypeDuplicate = 2
)

http://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public-data-dump-and-sede?rq=1

const (
	PostQuestion            = 1
	PostAnswer              = 2
	PostOrphanedTagWiki     = 3
	PostTagWikiExcerpt      = 4
	PostTagWiki             = 5
	PostModeratorNomination = 6
	PostWikiPlaceholder     = 7
	PostPrivilegeWiki       = 8
)

http://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public-data-dump-and-sede?rq=1

const (
	VoteAcceptedByOriginator   = 1
	VoteUpMod                  = 2
	VoteDownMod                = 3
	VoteOffensive              = 4
	VoteFavorite               = 5
	VoteClose                  = 6
	VoteReopen                 = 7
	VoteBountyStart            = 8
	VoteBountyClose            = 9
	VoteDeletion               = 10
	VoteUndeletion             = 11
	VoteSpam                   = 12
	VoteModeratorReview        = 15
	VoteApproveEditoSuggestion = 16
)

http://blog.stackoverflow.com/2009/06/stack-overflow-creative-commons-data-dump/#comment-24147
http://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public-data-dump-and-sede?rq=1
http://data.stackexchange.com/stackoverflow/query/102390/vote-types

const (
	// TimeFormat is how time is formatted in .xml files
	TimeFormat = "2006-01-02T15:04:05.999999999"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Badge

type Badge struct {
	ID     int
	UserID int
	Name   string
	Date   time.Time
}

Badge tells which badge a given user has

type Comment

type Comment struct {
	ID              int
	PostID          int
	Score           int
	Text            string
	CreationDate    time.Time
	UserID          int
	UserDisplayName string
}

Comment describes a comment

type Post

type Post struct {
	ID                    int
	PostTypeID            int
	ParentID              int // for PostAnswer
	AcceptedAnswerID      int
	CreationDate          time.Time
	Score                 int
	ViewCount             int
	Body                  string
	OwnerUserID           int
	OwnerDisplayName      string
	LastEditorUserID      int
	LastEditorDisplayName string
	LastEditDate          time.Time
	LastActivitityDate    time.Time
	Title                 string
	Tags                  []string
	AnswerCount           int
	CommentCount          int
	FavoriteCount         int
	CommunityOwnedDate    time.Time
	ClosedDate            time.Time
}

Post describes a post
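The PostTypeID and ParentID fields are enough to tie answers back to their questions. The sketch below shows one way that grouping might look; the trimmed post type and the answersByQuestion helper are hypothetical stand-ins defined locally, not part of the package.

```go
package main

import "fmt"

// Local copies of two post-type constants from this package.
const (
	PostQuestion = 1
	PostAnswer   = 2
)

// post is a trimmed, hypothetical stand-in for the Post struct,
// keeping only the fields this sketch needs.
type post struct {
	ID         int
	PostTypeID int
	ParentID   int // question ID, set for answers
}

// answersByQuestion maps each question ID to the IDs of its answers.
// Questions without answers still get an (empty) entry.
func answersByQuestion(posts []post) map[int][]int {
	out := make(map[int][]int)
	for _, p := range posts {
		switch p.PostTypeID {
		case PostQuestion:
			if _, ok := out[p.ID]; !ok {
				out[p.ID] = nil
			}
		case PostAnswer:
			out[p.ParentID] = append(out[p.ParentID], p.ID)
		}
	}
	return out
}

func main() {
	// Invented sample records.
	posts := []post{
		{ID: 1, PostTypeID: PostQuestion},
		{ID: 2, PostTypeID: PostAnswer, ParentID: 1},
		{ID: 3, PostTypeID: PostAnswer, ParentID: 1},
		{ID: 4, PostTypeID: PostQuestion},
	}
	for q, answers := range answersByQuestion(posts) {
		fmt.Println(q, answers)
	}
}
```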

type PostHistory

type PostHistory struct {
	ID                int
	PostHistoryTypeID int
	PostID            int
	RevisionGUID      string
	CreationDate      time.Time
	UserID            int
	UserDisplayName   string
	// if PostHistoryTypeID is 10, 11, 12, 13, 14, 15, this is JSON
	// with users who voted
	Text string
	// if PostHistoryTypeID is HistoryInitialTags or HistoyrEditTags
	// or HistoryRollbackTags, this is a decoded version of tags
	Tags    []string
	Comment string
}

PostHistory describes history of a post
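The field comments above tie the meaning of Text and Tags to PostHistoryTypeID. A small sketch of that dispatch follows; the helper names hasDecodedTags and hasVoterJSON are hypothetical, and the constants are local copies (including the package's HistoyrEditTags spelling).

```go
package main

import "fmt"

// Local copies of the relevant history-type constants.
const (
	HistoryInitialTags  = 3
	HistoyrEditTags     = 6 // spelled as in the package
	HistoryRollbackTags = 9
	HistoryPostClosed   = 10
	HistoryPostUnlocked = 15
)

// hasDecodedTags reports whether Tags is expected to be populated
// for a record of the given type, per the struct comment.
func hasDecodedTags(typeID int) bool {
	switch typeID {
	case HistoryInitialTags, HistoyrEditTags, HistoryRollbackTags:
		return true
	}
	return false
}

// hasVoterJSON reports whether Text holds JSON listing the users
// who voted (type IDs 10 through 15), per the struct comment.
func hasVoterJSON(typeID int) bool {
	return typeID >= HistoryPostClosed && typeID <= HistoryPostUnlocked
}

func main() {
	for _, id := range []int{3, 5, 6, 9, 10, 15, 16} {
		fmt.Println(id, hasDecodedTags(id), hasVoterJSON(id))
	}
}
```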

type PostLink

type PostLink struct {
	ID            int
	CreationDate  time.Time
	PostID        int
	RelatedPostID int
	LinkTypeID    int
}

PostLink describes links in a post

type Reader

type Reader struct {
	User        User
	Post        Post
	Comment     Comment
	Badge       Badge
	Tag         Tag
	PostHistory PostHistory
	PostLink    PostLink
	Vote        Vote
	// contains filtered or unexported fields
}

Reader is for iteratively reading records from xml file

func NewBadgesReader

func NewBadgesReader(r io.Reader) (*Reader, error)

NewBadgesReader returns a new reader for Badges.xml file

func NewBadgesReaderFromFile

func NewBadgesReaderFromFile(path string) (*Reader, error)

NewBadgesReaderFromFile returns a new reader for Badges.xml file

func NewCommentsReader

func NewCommentsReader(r io.Reader) (*Reader, error)

NewCommentsReader returns a new reader for Comments.xml file

func NewCommentsReaderFromFile

func NewCommentsReaderFromFile(path string) (*Reader, error)

NewCommentsReaderFromFile returns a new reader for Comments.xml file

func NewPostHistoryReader

func NewPostHistoryReader(r io.Reader) (*Reader, error)

NewPostHistoryReader returns a new reader for PostHistory.xml file

func NewPostHistoryReaderFromFile

func NewPostHistoryReaderFromFile(path string) (*Reader, error)

NewPostHistoryReaderFromFile returns a new reader for PostHistory.xml file

func NewPostLinksReader

func NewPostLinksReader(r io.Reader) (*Reader, error)

NewPostLinksReader returns a new reader for PostLinks.xml file

func NewPostLinksReaderFromFile

func NewPostLinksReaderFromFile(path string) (*Reader, error)

NewPostLinksReaderFromFile returns a new reader for PostLinks.xml file

func NewPostsReader

func NewPostsReader(r io.Reader) (*Reader, error)

NewPostsReader returns a new reader for Posts.xml file

func NewPostsReaderFromFile

func NewPostsReaderFromFile(path string) (*Reader, error)

NewPostsReaderFromFile returns a new reader for Posts.xml file

func NewTagsReader

func NewTagsReader(r io.Reader) (*Reader, error)

NewTagsReader returns a new reader for Tags.xml file

func NewTagsReaderFromFile

func NewTagsReaderFromFile(path string) (*Reader, error)

NewTagsReaderFromFile returns a new reader for Tags.xml file

func NewUsersReader

func NewUsersReader(r io.Reader) (*Reader, error)

NewUsersReader returns a new reader for Users.xml file

func NewUsersReaderFromFile

func NewUsersReaderFromFile(path string) (*Reader, error)

NewUsersReaderFromFile returns a new reader for Users.xml file

func NewVotesReader

func NewVotesReader(r io.Reader) (*Reader, error)

NewVotesReader returns a new reader for Votes.xml file

func NewVotesReaderFromFile

func NewVotesReaderFromFile(path string) (*Reader, error)

NewVotesReaderFromFile returns a new reader for Votes.xml file

func (*Reader) Close

func (r *Reader) Close()

Close closes a reader

func (*Reader) Err

func (r *Reader) Err() error

Err returns the error encountered while reading, if any

func (*Reader) Next

func (r *Reader) Next() bool

Next advances to the next record. Returns false on end or error.

type Tag

type Tag struct {
	ID            int
	TagName       string
	Count         int
	ExcerptPostID int
	WikiPostID    int
}

Tag describes a tag

type User

type User struct {
	ID              int
	Reputation      int
	CreationDate    time.Time
	DisplayName     string
	LastAccessDate  time.Time
	WebsiteURL      string
	Location        string
	AboutMe         string
	Views           int
	UpVotes         int
	DownVotes       int
	Age             int
	AccountID       int
	ProfileImageURL string
}

User describes a user

type Vote

type Vote struct {
	ID         int
	PostID     int
	VoteTypeID int
	// only present if VoteTypeID is 5 or 8
	UserID int
	// only present if VoteTypeID is 8 or 9
	BountyAmount int
	CreationDate time.Time
}

Vote describes a vote
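As a rough sketch of how VoteTypeID might be used, the snippet below nets up-votes against down-votes per post. The trimmed vote type and the netVotes helper are hypothetical, the constants are local copies, and treating the result as a post's score is an assumption rather than something this package states.

```go
package main

import "fmt"

// Local copies of two vote-type constants from this package.
const (
	VoteUpMod   = 2
	VoteDownMod = 3
)

// vote is a trimmed, hypothetical stand-in for the Vote struct.
type vote struct {
	PostID     int
	VoteTypeID int
}

// netVotes returns up-votes minus down-votes for each post ID.
// All other vote types are ignored.
func netVotes(votes []vote) map[int]int {
	net := make(map[int]int)
	for _, v := range votes {
		switch v.VoteTypeID {
		case VoteUpMod:
			net[v.PostID]++
		case VoteDownMod:
			net[v.PostID]--
		}
	}
	return net
}

func main() {
	// Invented sample records.
	votes := []vote{
		{PostID: 1, VoteTypeID: VoteUpMod},
		{PostID: 1, VoteTypeID: VoteUpMod},
		{PostID: 1, VoteTypeID: VoteDownMod},
		{PostID: 2, VoteTypeID: VoteDownMod},
		{PostID: 2, VoteTypeID: 5}, // favorite, ignored here
	}
	fmt.Println(netVotes(votes))
}
```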

Directories

Path Synopsis
cmd
