Documentation ¶
Index ¶
- Constants
- Variables
- func AnnexBWorstSize(startCodeLen, rawLen int) int
- func DecodeAnnexB(encoded []byte) []byte
- func DecodeAnnexBSize(encoded []byte) int
- func DecodeClosestImageInPacketList(codec Codec, packets []*VideoPacket, targetTime time.Time, cache *FrameCache, ...) (*cimg.Image, time.Time, error)
- func DecodeFirstImageInPacketList(codec Codec, packets []*VideoPacket) (*cimg.Image, time.Time, error)
- func DecodeSinglePacketToImage(codec Codec, packet *VideoPacket) (*cimg.Image, error)
- func EncodeAnnexB(raw []byte, startCodeLen int, flags AnnexBEncodeFlags) []byte
- func EncodeAnnexBInto(raw []byte, startCodeLen int, flags AnnexBEncodeFlags, dst []byte) (encodedSize int, bufferSizeOK bool)
- func ExtractFrame(srcFilename string, atSecond float64, outputWidth int) ([]byte, error)
- func ExtractVideoDuration(srcFilename string) (time.Duration, error)
- func FirstLikelyAnnexBEncodedIndex(encoded []byte) int
- func IsVisualPacket(t h264.NALUType) bool
- func NALUStartCode(length int) []byte
- func NumPlanes(pixelFormat AVPixelFormat) int
- func ParseBinFilename(filename string) (packetNumber int, naluNumber int, timeNS int64)
- func ParseH264SPS(nalu []byte) (width, height int, err error)
- func RunAppCombinedOutput(app_name string, args []string) ([]byte, error)
- func TranscodeMediumQualitySeekable(srcFilename, dstFilename string) error
- func TranscodeSeekable(srcFilename, dstFilename string) error
- type AVPixelFormat
- type AnnexBEncodeFlags
- type Codec
- type Frame
- type FrameCache
- type MPGTSEncoder
- type NALU
- type PacketBuffer
- func (r *PacketBuffer) DecodeHeader() (width, height int, err error)
- func (r *PacketBuffer) DumpBin(dir string) error
- func (r *PacketBuffer) ExtractThumbnail() (*cimg.Image, error)
- func (r *PacketBuffer) FindClosestPacketWallPTS(wallPTS time.Time, keyframeOnly bool) int
- func (r *PacketBuffer) FindFirstIDR() int
- func (r *PacketBuffer) FirstNALUOfType(ofType h264.NALUType) *NALU
- func (r *PacketBuffer) HasIDR() bool
- func (r *PacketBuffer) IndexOfFirstNALUOfType(ofType h264.NALUType) (packetIdx int, indexInPacket int)
- func (r *PacketBuffer) ResetPTS()
- func (r *PacketBuffer) SaveToMP4(filename string) error
- func (r *PacketBuffer) SaveToMPEGTS(log logs.Log, output io.Writer) error
- type PayloadFormat
- type VideoDecoder
- func (d *VideoDecoder) Close()
- func (d *VideoDecoder) Decode(packet *VideoPacket) (*Frame, error)
- func (d *VideoDecoder) DecodeDeepRef(packet *VideoPacket) (*Frame, error)
- func (d *VideoDecoder) FrameTimeToDuration(pts int64) time.Duration
- func (d *VideoDecoder) Height() int
- func (d *VideoDecoder) NextFrame() (*Frame, error)
- func (d *VideoDecoder) NextFrameDeepRef() (*Frame, error)
- func (d *VideoDecoder) Width() int
- type VideoEncoder
- func (v *VideoEncoder) Close()
- func (v *VideoEncoder) WriteImage(pts time.Duration, data [][]uint8, stride []int) error
- func (v *VideoEncoder) WriteNALU(dts, pts time.Duration, nalu NALU) error
- func (v *VideoEncoder) WritePacket(dts, pts time.Duration, packet *VideoPacket) error
- func (v *VideoEncoder) WriteTrailer() error
- type VideoEncoderType
- type VideoPacket
- func (p *VideoPacket) Clone() *VideoPacket
- func (p *VideoPacket) EncodeToAnnexBPacket() []byte
- func (p *VideoPacket) FirstNALUOfType(t h264.NALUType) *NALU
- func (p *VideoPacket) HasIDR() bool
- func (p *VideoPacket) HasType(t h264.NALUType) bool
- func (p *VideoPacket) IsIFrame() bool
- func (p *VideoPacket) PayloadBytes() int
- func (p *VideoPacket) Summary() string
Constants ¶
const DebugVideoDecodeTimes = false
If true, report the decode FPS
const EnableEmulationPreventBytesEscaping = true
Topic: $ANNEXB-CONFUSION
Here's the story: When we receive packets from Hikvision cameras, via github.com/bluenviron/gortsplib, the packets are supposedly NALUFormatRBSP, aka raw data bits, with no start codes, and no emulation prevention bytes. The codecs seem to want packets in SODB (aka AnnexB) encoding, so we dutifully encode the raw packets into AnnexB, with emulation prevention bytes added. HOWEVER, when we activate this code path, we get sporadic errors from ffmpeg, telling us that we've got bad frames. If we comment out the code that does the emulation prevention byte injection, then these errors go away. To be clear, we must inject the start codes. This is unambiguous. It's the emulation prevention bytes that cause errors.
This confusion is the reason for this constant. At some point we'll hopefully learn more, and make better sense of this. Right now the culprit could be any one of these:
1. HikVision cameras
2. gortsplib
3. The way I'm using the h264 codec in ffmpeg
4. My SODB/Annex-B encoder
5. My understanding
------------------------ UPDATE WITH ANSWER ------------------------
I have come to the conclusion that my Hikvision cameras are sending data with emulation prevention bytes added to the byte stream, but without start codes. So this has led me to store two pieces of state with each NALU:
1. Does it have a start code?
2. How is the payload encoded?
I initially thought that the presence of a start code should be synonymous with the presence of emulation prevention bytes, but I've learned that this is not the case.
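One way to check a camera stream for this situation is to look for the byte pattern that escaping produces. The sketch below is a hypothetical helper (the name firstLikelyEscapeIndex is made up here, and it is not claimed to behave like FirstLikelyAnnexBEncodedIndex in this package). It reports the position of the first 0x03 that follows two zero bytes and is followed by a byte no greater than 0x03, or by the end of the payload. An unescaped payload can contain this pattern by coincidence, so a hit is a hint and not proof.

```go
package main

import "fmt"

// firstLikelyEscapeIndex returns the index of the first byte that looks like
// an emulation prevention byte: a 0x03 preceded by 00 00 and followed either
// by a byte <= 0x03 or by the end of the payload. Returns -1 if none is found.
// Hypothetical helper, for illustration only.
func firstLikelyEscapeIndex(payload []byte) int {
	for i := 0; i+2 < len(payload); i++ {
		if payload[i] != 0 || payload[i+1] != 0 || payload[i+2] != 3 {
			continue
		}
		if i+3 == len(payload) || payload[i+3] <= 3 {
			return i + 2
		}
	}
	return -1
}

func main() {
	escaped := []byte{0x67, 0x00, 0x00, 0x03, 0x01, 0x80}
	plain := []byte{0x67, 0x00, 0x00, 0x01, 0x80}
	fmt.Println(firstLikelyEscapeIndex(escaped)) // 3
	fmt.Println(firstLikelyEscapeIndex(plain))   // -1
}
```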
Variables ¶
Functions ¶
func AnnexBWorstSize ¶
Return the worst case size of an Annex-B encoded packet, given the size of the raw packet (including a 3 byte start code).
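The worst case arises when the raw payload is all zeros, because escaping then inserts one 0x03 for roughly every two input bytes. A minimal sketch of a safe upper bound follows, assuming rawLen excludes the start code and startCodeLen is 0, 3 or 4. The function name is hypothetical, and this is not necessarily the formula AnnexBWorstSize uses.

```go
package main

import "fmt"

// worstCaseEscapedSize returns an upper bound on the encoded size:
// the start code, the raw bytes, and at most one inserted 0x03 per
// two raw bytes. Hypothetical helper, for illustration only.
func worstCaseEscapedSize(startCodeLen, rawLen int) int {
	return startCodeLen + rawLen + rawLen/2
}

func main() {
	fmt.Println(worstCaseEscapedSize(3, 4))   // 9
	fmt.Println(worstCaseEscapedSize(0, 100)) // 150
}
```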
func DecodeAnnexB ¶
Decode an Annex-B encoded packet into a Raw Byte Sequence Payload (RBSP). We assume that you're handling the 3 or 4 byte NALU prefix outside of this function.
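Decoding amounts to dropping every 0x03 that follows two zero bytes. The sketch below shows that idea with a hypothetical function name. It assumes the start code has already been removed, and it is an illustration rather than the implementation behind DecodeAnnexB.

```go
package main

import "fmt"

// stripEmulationPrevention removes each 0x03 that directly follows two zero
// bytes, which recovers the unescaped payload. Hypothetical helper.
func stripEmulationPrevention(encoded []byte) []byte {
	out := make([]byte, 0, len(encoded))
	zeros := 0 // number of consecutive zero bytes just written
	for _, b := range encoded {
		if zeros >= 2 && b == 0x03 {
			zeros = 0 // skip the emulation prevention byte
			continue
		}
		out = append(out, b)
		if b == 0 {
			zeros++
		} else {
			zeros = 0
		}
	}
	return out
}

func main() {
	fmt.Printf("% x\n", stripEmulationPrevention([]byte{0x00, 0x00, 0x03, 0x01})) // 00 00 01
}
```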
func DecodeAnnexBSize ¶
Return the number of bytes needed to decode an Annex-B encoded packet. This function is for analysis of camera streams. In ordinary usage, we just call DecodeAnnexB().
func DecodeClosestImageInPacketList ¶
func DecodeClosestImageInPacketList(codec Codec, packets []*VideoPacket, targetTime time.Time, cache *FrameCache, videoCacheKey string) (*cimg.Image, time.Time, error)
Decode the list of packets, and return the decoded image whose presentation time is closest to targetTime. If targetTime is zero, then we return the first image coming out of the decoder. If cache is not nil, then we will insert/query the provided cache. videoCacheKey is the key for this video. We use {videoCacheKey-PTS} as the complete cache key.
func DecodeFirstImageInPacketList ¶
func DecodeFirstImageInPacketList(codec Codec, packets []*VideoPacket) (*cimg.Image, time.Time, error)
Decode the list of packets, and return the first image that successfully decodes
func DecodeSinglePacketToImage ¶
func DecodeSinglePacketToImage(codec Codec, packet *VideoPacket) (*cimg.Image, error)
Creates a decoder and attempts to decode a single IDR packet. This was built for extracting a thumbnail during a long recording. Obviously this is a bit expensive, because you're creating a decoder for just a single frame.
func EncodeAnnexB ¶
func EncodeAnnexB(raw []byte, startCodeLen int, flags AnnexBEncodeFlags) []byte
Encode an RBSP (Raw Byte Sequence Payload) into Annex-B format, optionally adding a 3 or 4 byte start code (00.00.01 or 00.00.00.01) to the beginning of the encoded byte stream. We also add the "emulation prevention byte" (0x03) where necessary, if the relevant flag is set. If startCodeLen is zero, then we do not add a start code.
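The escaping rule can be sketched as follows: whenever two zero bytes have just been written and the next byte is 0x03 or less, a 0x03 is written first. The function below is a hypothetical stand-in that takes a plain bool instead of AnnexBEncodeFlags. It illustrates the technique and is not claimed to match EncodeAnnexB byte for byte.

```go
package main

import "fmt"

// encodeAnnexBSketch prepends an optional 3 or 4 byte start code and, if
// escape is true, inserts emulation prevention bytes. Hypothetical helper.
func encodeAnnexBSketch(raw []byte, startCodeLen int, escape bool) []byte {
	out := make([]byte, 0, startCodeLen+len(raw)+len(raw)/2)
	switch startCodeLen {
	case 3:
		out = append(out, 0, 0, 1)
	case 4:
		out = append(out, 0, 0, 0, 1)
	}
	zeros := 0 // consecutive zero bytes written since the last non-zero byte
	for _, b := range raw {
		if escape && zeros == 2 && b <= 0x03 {
			out = append(out, 0x03)
			zeros = 0
		}
		out = append(out, b)
		if b == 0 {
			zeros++
		} else {
			zeros = 0
		}
	}
	return out
}

func main() {
	fmt.Printf("% x\n", encodeAnnexBSketch([]byte{0, 0, 1}, 3, true)) // 00 00 01 00 00 03 01
}
```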
func EncodeAnnexBInto ¶
func EncodeAnnexBInto(raw []byte, startCodeLen int, flags AnnexBEncodeFlags, dst []byte) (encodedSize int, bufferSizeOK bool)
Encode an RBSP (Raw Byte Sequence Payload) into Annex-B format, optionally adding a 3 byte start code (00.00.01) to the beginning of the encoded byte stream. This encoding adds the "emulation prevention byte" (0x03) where necessary.
func ExtractFrame ¶
Extract a single frame from a video file and return the JPEG bytes. If outputWidth is zero, then we use the same width as the input video.
func ExtractVideoDuration ¶
Extract the duration of a video file
func IsVisualPacket ¶
func NALUStartCode ¶
func NumPlanes ¶
func NumPlanes(pixelFormat AVPixelFormat) int
func ParseBinFilename ¶
This is just used for debugging and testing
func ParseH264SPS ¶
Parse a raw SPS NALU (not Annex-B). On Rpi5, this takes 305ns for a 50 byte SPS packet, which is typical on my Hikvisions. On AMD Ryzen 9 5900X, this takes 94ns.
func RunAppCombinedOutput ¶
app_name is an executable, such as "ffmpeg" or "ffprobe". args must not include the executable name as the first parameter. Returns the output from exec.Cmd's "CombinedOutput" method.
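For reference, the standard library calls involved look like this. runCombined is a hypothetical stand-in, not this package's code. It shows why args must not repeat the executable name: exec.Command already takes the name as its first parameter.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runCombined runs appName with args and returns stdout and stderr combined.
// Hypothetical stand-in for illustration.
func runCombined(appName string, args []string) ([]byte, error) {
	return exec.Command(appName, args...).CombinedOutput()
}

func main() {
	out, err := runCombined("ffprobe", []string{"-version"})
	if err != nil {
		fmt.Println("ffprobe failed:", err)
		return
	}
	fmt.Printf("%s", out)
}
```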
func TranscodeMediumQualitySeekable ¶
Transcode the high quality video stream into a slightly lower quality stream, with keyframes every 8 frames, and with noise reduction. This is for use on our training platform, where people need to be able to seek randomly inside a video.
func TranscodeSeekable ¶
Transcode a video to make it easy for a low powered mobile browser to seek to random video positions
Types ¶
type AVPixelFormat ¶
type AVPixelFormat int
Export some of the ffmpeg C pixel formats to Go
const (
	AVPixelFormatYUV420P AVPixelFormat = C.AV_PIX_FMT_YUV420P
	AVPixelFormatRGB24   AVPixelFormat = C.AV_PIX_FMT_RGB24
)
type AnnexBEncodeFlags ¶
type AnnexBEncodeFlags int
Flags that control how EncodeAnnexB works
const (
	AnnexBEncodeFlagNone                        AnnexBEncodeFlags = 0 // This is nonsensical - it is simply a memcpy
	AnnexBEncodeFlagAddEmulationPreventionBytes AnnexBEncodeFlags = 1 // Add emulation prevention bytes (0x03) where necessary
)
type Frame ¶
type Frame struct {
	Image *accel.YUVImage // Image (might be a deep reference into ffmpeg memory)
	PTS   int64           // Presentation time in native time units. Use VideoDecoder.FrameTimeToDuration() to convert to a time.Duration
}
A decoded frame
type FrameCache ¶
type FrameCache struct {
	MaxMemory  int // Maximum bytes of RAM to use
	MemoryUsed int // Current bytes of RAM used
	// contains filtered or unexported fields
}
FrameCache is used to speed up the fetching of individual frames while a user is seeking around in a video. We cache YUV images.
func NewFrameCache ¶
func NewFrameCache(maxMemory int) *FrameCache
NewFrameCache creates a new FrameCache with the given maximum memory usage
func (*FrameCache) AddFrame ¶
func (f *FrameCache) AddFrame(key string, frame *accel.YUVImage)
Add a frame to the cache
type MPGTSEncoder ¶
type MPGTSEncoder struct {
// contains filtered or unexported fields
}
MPGTSEncoder encodes H264 NALUs into MPEG-TS.
func NewMPEGTSEncoder ¶
func NewMPEGTSEncoder(log logs.Log, output io.Writer, sps []byte, pps []byte) (*MPGTSEncoder, error)
NewMPEGTSEncoder allocates an MPGTSEncoder.
func (*MPGTSEncoder) Close ¶
func (e *MPGTSEncoder) Close() error
Close closes all the MPGTSEncoder resources.
type NALU ¶
type NALU struct {
	PayloadIsAnnexB  bool
	PayloadNoEscapes bool // True if PayloadIsAnnexB BUT we know that we have no "emulation prevention bytes", so we can avoid decoding them.
	Payload          []byte
}
Codec NALU
func WrapRawNALU ¶
Wrap a raw buffer in a NALU object. Do not clone memory, or add prefix bytes.
func (*NALU) AsAnnexB ¶
Return payload data, but make sure it's in AnnexB format, and has a start code of 00.00.01 or 00.00.00.01
func (*NALU) IsAnnexBWithStartCode ¶
Returns true if the NALU has a start code, and the payload is encoded with emulation prevention bytes
func (*NALU) IsRBSPWithNoStartCode ¶
Returns true if the NALU has no start code, and the payload is not encoded with emulation prevention bytes
func (*NALU) PayloadOnly ¶
Returns only the payload, without any start code
func (*NALU) StartCodeLen ¶
Returns length of start code. Possible return values:
- 0: No start code
- 3: 00 00 01
- 4: 00 00 00 01
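A start code check along these lines can be sketched as below. The 4 byte form is tested first, because a 4 byte start code does not begin with the 3 byte pattern. The function name is hypothetical, and the sketch works on a plain byte slice, not on this package's NALU type.

```go
package main

import "fmt"

// startCodeLen returns 4 for a 00 00 00 01 prefix, 3 for a 00 00 01 prefix,
// and 0 if the buffer has no start code. Hypothetical helper.
func startCodeLen(b []byte) int {
	if len(b) >= 4 && b[0] == 0 && b[1] == 0 && b[2] == 0 && b[3] == 1 {
		return 4
	}
	if len(b) >= 3 && b[0] == 0 && b[1] == 0 && b[2] == 1 {
		return 3
	}
	return 0
}

func main() {
	fmt.Println(startCodeLen([]byte{0, 0, 0, 1, 0x67})) // 4
	fmt.Println(startCodeLen([]byte{0, 0, 1, 0x67}))    // 3
	fmt.Println(startCodeLen([]byte{0x67, 0x42}))       // 0
}
```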
type PacketBuffer ¶
type PacketBuffer struct {
Packets []*VideoPacket
}
A list of packets, with some helper functions
func ExtractFsvPackets ¶
func ExtractFsvPackets(input []fsv.NALU) *PacketBuffer
Convert FSV packets to our VideoPacket format
func LoadBinDir ¶
func LoadBinDir(dir string) (*PacketBuffer, error)
Opposite of PacketBuffer.DumpBin. NOTE: We don't attempt to inject SPS and PPS into the PacketBuffer, but it would be trivial for H264: just look at the first byte of the payload (0x67 and 0x68 for SPS and PPS).
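The 67 and 68 in the note above are hexadecimal values of the one byte H264 NALU header. The low five bits of that byte hold the NALU type (7 for SPS, 8 for PPS, 5 for an IDR slice) and the two bits above them hold nal_ref_idc. A small sketch, with hypothetical helper names:

```go
package main

import "fmt"

// naluType extracts nal_unit_type, the low five bits of the H264 NALU header.
func naluType(header byte) int {
	return int(header & 0x1F)
}

// nalRefIDC extracts nal_ref_idc, the two bits above the type field.
func nalRefIDC(header byte) int {
	return int(header>>5) & 0x03
}

func main() {
	fmt.Println(naluType(0x67), nalRefIDC(0x67)) // 7 3 (SPS)
	fmt.Println(naluType(0x68), nalRefIDC(0x68)) // 8 3 (PPS)
	fmt.Println(naluType(0x65), nalRefIDC(0x65)) // 5 3 (IDR slice)
}
```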
func (*PacketBuffer) DecodeHeader ¶
func (r *PacketBuffer) DecodeHeader() (width, height int, err error)
Decode SPS and PPS to extract header information
func (*PacketBuffer) DumpBin ¶
func (r *PacketBuffer) DumpBin(dir string) error
Dump each NALU to a .raw file
func (*PacketBuffer) ExtractThumbnail ¶
func (r *PacketBuffer) ExtractThumbnail() (*cimg.Image, error)
Decode the center-most keyframe. This is O(1), assuming no errors or funny business like no keyframes.
func (*PacketBuffer) FindClosestPacketWallPTS ¶
func (r *PacketBuffer) FindClosestPacketWallPTS(wallPTS time.Time, keyframeOnly bool) int
Find the packet with the WallPTS closest to the given time
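A minimal sketch of such a search, using a linear scan over a hypothetical stamped type instead of this package's VideoPacket. Returning -1 when nothing qualifies is a choice made for the sketch. The behavior of FindClosestPacketWallPTS in that case is not documented here.

```go
package main

import (
	"fmt"
	"time"
)

// stamped is a hypothetical stand-in for a packet with a wall-clock PTS.
type stamped struct {
	WallPTS  time.Time
	Keyframe bool
}

// pkt builds a stamped packet at the given Unix second (demo helper).
func pkt(sec int, keyframe bool) stamped {
	return stamped{WallPTS: time.Unix(int64(sec), 0), Keyframe: keyframe}
}

// closestIndex returns the index of the packet whose WallPTS is nearest to
// target, optionally considering keyframes only, or -1 if none qualifies.
func closestIndex(packets []stamped, target time.Time, keyframeOnly bool) int {
	best := -1
	var bestDiff time.Duration
	for i, p := range packets {
		if keyframeOnly && !p.Keyframe {
			continue
		}
		diff := p.WallPTS.Sub(target)
		if diff < 0 {
			diff = -diff
		}
		if best == -1 || diff < bestDiff {
			best, bestDiff = i, diff
		}
	}
	return best
}

func main() {
	ps := []stamped{pkt(10, true), pkt(12, false), pkt(20, true)}
	target := time.Unix(13, 0)
	fmt.Println(closestIndex(ps, target, false)) // 1
	fmt.Println(closestIndex(ps, target, true))  // 0
}
```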
func (*PacketBuffer) FindFirstIDR ¶
func (r *PacketBuffer) FindFirstIDR() int
Returns the index of the first keyframe in the buffer, or -1 if none found
func (*PacketBuffer) FirstNALUOfType ¶
func (r *PacketBuffer) FirstNALUOfType(ofType h264.NALUType) *NALU
Returns the first NALU of the given type, or nil if none found
func (*PacketBuffer) HasIDR ¶
func (r *PacketBuffer) HasIDR() bool
Returns true if we have at least one keyframe in the buffer
func (*PacketBuffer) IndexOfFirstNALUOfType ¶
func (r *PacketBuffer) IndexOfFirstNALUOfType(ofType h264.NALUType) (packetIdx int, indexInPacket int)
func (*PacketBuffer) ResetPTS ¶
func (r *PacketBuffer) ResetPTS()
Adjust all PTS values so that the first frame starts at time 0
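The adjustment is a subtraction of the first PTS from every PTS. A sketch over a hypothetical packet type (this package's VideoPacket stores the value in H264PTS):

```go
package main

import (
	"fmt"
	"time"
)

// packet is a hypothetical stand-in holding only a presentation time.
type packet struct {
	PTS time.Duration
}

// resetPTS shifts every PTS so that the first packet starts at zero.
func resetPTS(packets []packet) {
	if len(packets) == 0 {
		return
	}
	offset := packets[0].PTS
	for i := range packets {
		packets[i].PTS -= offset
	}
}

func main() {
	ps := []packet{{PTS: 5 * time.Second}, {PTS: 5040 * time.Millisecond}}
	resetPTS(ps)
	fmt.Println(ps[0].PTS, ps[1].PTS) // 0s 40ms
}
```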
func (*PacketBuffer) SaveToMP4 ¶
func (r *PacketBuffer) SaveToMP4(filename string) error
func (*PacketBuffer) SaveToMPEGTS ¶
Extract saved buffer into an MPEGTS stream
type PayloadFormat ¶
type PayloadFormat int8
PayloadFormat tells us the state of the payload, such as whether it has been escaped for Annex-B
const (
	PayloadRawBytes PayloadFormat = iota // Not escaped (RBSP)
	PayloadAnnexB                        // Annex-B escaped (SODB)
)
type VideoDecoder ¶
type VideoDecoder struct {
// contains filtered or unexported fields
}
VideoDecoder is a wrapper around ffmpeg, for decoding videos
func NewVideoFileDecoder ¶
func NewVideoFileDecoder(filename string) (*VideoDecoder, error)
Create a new decoder that will decode a file
func NewVideoStreamDecoder ¶
func NewVideoStreamDecoder(codec Codec) (*VideoDecoder, error)
Create a new decoder that you will feed with packets
func (*VideoDecoder) Close ¶
func (d *VideoDecoder) Close()
func (*VideoDecoder) Decode ¶
func (d *VideoDecoder) Decode(packet *VideoPacket) (*Frame, error)
Decode the packet and return a copy of the YUV image. This is used when decoding a stream (not a file).
func (*VideoDecoder) DecodeDeepRef ¶
func (d *VideoDecoder) DecodeDeepRef(packet *VideoPacket) (*Frame, error)
WARNING: The image returned is only valid while the decoder is still alive, and it will be clobbered by the subsequent DecodeDeepRef/Decode(). The pixels in the returned image are not a garbage-collected Go slice. They point directly into the libavcodec decode buffer. That's why the function name has the "DeepRef" suffix.
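The hazard can be illustrated with ordinary Go slices. The fakeDecoder below is entirely hypothetical and has nothing to do with libavcodec. It only mimics a decoder that reuses one internal buffer, so a returned reference changes under the caller on the next decode, while a copy does not.

```go
package main

import "fmt"

// fakeDecoder reuses a single internal buffer for every frame.
// Hypothetical type, for illustration only.
type fakeDecoder struct {
	buf []byte
	n   byte
}

// decode overwrites the internal buffer with the next "frame".
func (d *fakeDecoder) decode() {
	d.n++
	for i := range d.buf {
		d.buf[i] = d.n
	}
}

// nextDeepRef returns a slice that aliases the internal buffer.
func (d *fakeDecoder) nextDeepRef() []byte {
	d.decode()
	return d.buf
}

// next returns an independent copy of the decoded frame.
func (d *fakeDecoder) next() []byte {
	d.decode()
	return append([]byte(nil), d.buf...)
}

func main() {
	d := &fakeDecoder{buf: make([]byte, 4)}
	ref := d.nextDeepRef() // frame 1, aliases the decoder buffer
	cp := d.next()         // frame 2, copied out
	d.nextDeepRef()        // frame 3 clobbers what ref points at
	fmt.Println(ref, cp)   // [3 3 3 3] [2 2 2 2]
}
```

If a frame has to outlive the next decode call, copy it first, or use the non-DeepRef variant.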
func (*VideoDecoder) FrameTimeToDuration ¶
func (d *VideoDecoder) FrameTimeToDuration(pts int64) time.Duration
Convert a native frame time to a time.Duration
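As a rough illustration, assume the "native time units" are ticks of an ffmpeg-style rational time base of num/den seconds (an assumption, since the time base is not exposed here). The conversion can then be sketched as below. The function and its parameters are hypothetical. The split into whole seconds and a remainder keeps the multiplication by one billion from overflowing for large PTS values.

```go
package main

import (
	"fmt"
	"time"
)

// ptsToDuration converts pts ticks of a num/den second time base into a
// time.Duration. Hypothetical helper, for illustration only.
func ptsToDuration(pts, num, den int64) time.Duration {
	ticks := pts * num
	whole := ticks / den // whole seconds
	rem := ticks % den   // leftover ticks, smaller than one second
	return time.Duration(whole)*time.Second +
		time.Duration(rem)*time.Second/time.Duration(den)
}

func main() {
	fmt.Println(ptsToDuration(90000, 1, 90000)) // 1s
	fmt.Println(ptsToDuration(1, 1, 25))        // 40ms
}
```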
func (*VideoDecoder) Height ¶
func (d *VideoDecoder) Height() int
func (*VideoDecoder) NextFrame ¶
func (d *VideoDecoder) NextFrame() (*Frame, error)
NextFrame reads the next frame from a file and returns a copy of the YUV image.
func (*VideoDecoder) NextFrameDeepRef ¶
func (d *VideoDecoder) NextFrameDeepRef() (*Frame, error)
NextFrameDeepRef will read the next frame from a file and return a deep reference into the libavcodec decoded image buffer. The next call to NextFrame/NextFrameDeepRef will invalidate that image.
func (*VideoDecoder) Width ¶
func (d *VideoDecoder) Width() int
type VideoEncoder ¶
type VideoEncoder struct {
	InputPixelFormat AVPixelFormat
	// contains filtered or unexported fields
}
func NewVideoEncoder ¶
func NewVideoEncoder(codec, format, filename string, width, height int, pixelFormatIn, pixelFormatOut AVPixelFormat, encoderType VideoEncoderType, fps int) (*VideoEncoder, error)
NewVideoEncoder creates a new video encoder. You must Close() a video encoder when you are done using it, otherwise you will leak ffmpeg objects.
func (*VideoEncoder) Close ¶
func (v *VideoEncoder) Close()
func (*VideoEncoder) WriteImage ¶
Write an RGB (single plane) or YUV (3 planes) image to the encoder
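For the three plane case, the sketch below computes strides and plane sizes for a YUV420P image, where the U and V planes are subsampled by two in both directions. It assumes tightly packed planes (stride equal to plane width), which real ffmpeg buffers do not guarantee. The function name is hypothetical.

```go
package main

import "fmt"

// yuv420pPlanes returns the stride and byte size of the Y, U and V planes of
// a tightly packed YUV420P image. Hypothetical helper, for illustration only.
func yuv420pPlanes(width, height int) (strides, sizes [3]int) {
	cw, ch := (width+1)/2, (height+1)/2 // chroma dimensions, rounded up
	strides = [3]int{width, cw, cw}
	sizes = [3]int{width * height, cw * ch, cw * ch}
	return strides, sizes
}

func main() {
	strides, sizes := yuv420pPlanes(640, 480)
	fmt.Println(strides) // [640 320 320]
	fmt.Println(sizes)   // [307200 76800 76800]
}
```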
func (*VideoEncoder) WriteNALU ¶
func (v *VideoEncoder) WriteNALU(dts, pts time.Duration, nalu NALU) error
func (*VideoEncoder) WritePacket ¶
func (v *VideoEncoder) WritePacket(dts, pts time.Duration, packet *VideoPacket) error
func (*VideoEncoder) WriteTrailer ¶
func (v *VideoEncoder) WriteTrailer() error
type VideoEncoderType ¶
type VideoEncoderType int
const (
	VideoEncoderTypePackets     VideoEncoderType = C.EncoderTypePackets     // Sending pre-encoded packets/NALUs to the encoder
	VideoEncoderTypeImageFrames VideoEncoderType = C.EncoderTypeImageFrames // Sending image frames to the encoder
)
type VideoPacket ¶
type VideoPacket struct {
	RawRecvID   int64     // Arbitrary monotonically increasing ID of raw received. Used to detect dropped packets, or other issues like that.
	ValidRecvID int64     // Arbitrary monotonically increasing ID of useful decoded packets. Used to detect dropped packets, or other issues like that.
	RecvTime    time.Time // Wall time when the packet was received. This is obviously subject to network jitter etc, so not a substitute for PTS
	H264NALUs   []NALU
	H264PTS     time.Duration
	WallPTS     time.Time // Reference wall time combined with the received PTS. We consider this the ground truth/reality of when the packet was recorded.
	IsBacklog   bool      // a bit of a hack to inject this state here. maybe an integer counter would suffice? (eg nBacklogPackets)
}
VideoPacket is what we store in our ring buffer
func ClonePacket ¶
func ClonePacket(nalusIn [][]byte, pts time.Duration, recvTime time.Time, wallPTS time.Time, isPayloadAnnexBEncoded bool) *VideoPacket
Clone a packet of NALUs and return the cloned packet. NOTE: gortsplib re-uses buffers, which is why we copy the payloads. NOTE2: I think that after upgrading gortsplib in Jan 2024, it no longer re-uses buffers, so I should revisit the requirement of our deep clone here.
func (*VideoPacket) EncodeToAnnexBPacket ¶
func (p *VideoPacket) EncodeToAnnexBPacket() []byte
Encode all NALUs in the packet into AnnexB format (i.e. with 00,00,01 prefix bytes)
func (*VideoPacket) FirstNALUOfType ¶
func (p *VideoPacket) FirstNALUOfType(t h264.NALUType) *NALU
Returns the first NALU of the given type, or nil if none exists
func (*VideoPacket) HasIDR ¶
func (p *VideoPacket) HasIDR() bool
Returns true if this packet has a keyframe
func (*VideoPacket) HasType ¶
func (p *VideoPacket) HasType(t h264.NALUType) bool
Return true if this packet has a NALU of type t inside
func (*VideoPacket) IsIFrame ¶
func (p *VideoPacket) IsIFrame() bool
Return true if this packet has one NALU which is an intermediate frame
func (*VideoPacket) PayloadBytes ¶
func (p *VideoPacket) PayloadBytes() int
Returns the number of bytes of NALU data. If the NALUs have annex-b prefixes, then these are included in the size.
func (*VideoPacket) Summary ¶
func (p *VideoPacket) Summary() string