Documentation
Constants ¶
This section is empty.
Variables ¶
var ErrBinaryFile = fmt.Errorf("file appears to be binary")
ErrBinaryFile is returned when a file appears to contain binary content.
Functions ¶
func SupportedExtensions ¶
func SupportedExtensions() []string
SupportedExtensions returns all supported file extensions.
Types ¶
type DOCXLoader ¶
type DOCXLoader struct{}
func (*DOCXLoader) Load ¶
func (l *DOCXLoader) Load(path string) (*RawDocument, error)
func (*DOCXLoader) Supports ¶
func (l *DOCXLoader) Supports(ext string) bool
type DocumentLoader ¶
type DocumentLoader interface {
Load(path string) (*RawDocument, error)
Supports(ext string) bool
}
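DocumentLoader is implemented by types that can load a file and return its extracted text as a RawDocument. Supports reports whether the loader handles the given file extension.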
type MarkdownLoader ¶
type MarkdownLoader struct{}
func (*MarkdownLoader) Load ¶
func (l *MarkdownLoader) Load(path string) (*RawDocument, error)
func (*MarkdownLoader) Supports ¶
func (l *MarkdownLoader) Supports(ext string) bool
type RawDocument ¶
type RawDocument struct {
Path string
Title string
DocType string
Content string // full extracted text
}
RawDocument is the output of loading a file.
func Load ¶
func Load(path string) (*RawDocument, error)
Load dispatches to the appropriate loader based on the file extension. It returns ErrBinaryFile, which callers can treat as non-fatal, if the file appears to contain binary content.
type WebLoader ¶
type WebLoader struct {
// contains filtered or unexported fields
}
WebLoader fetches a single URL and extracts its text content.
func NewWebLoader ¶
func NewWebLoader() *WebLoader