go-redis-search-files

command module

v0.0.0-...-068aa71 Published: Oct 6, 2023 License: MIT Imports: 17 Imported by: 0

Repository

github.com/GHLabidi/go-redis-search-files

Go Redis Search Files

This is a work in progress

Description

A tool to search for a word in a large number of files either sequentially or by using goroutines. It is written in Go and uses Redis to cache the search results.

Aim

The goal is to eventually turn this into a distributed system that can be used to search for a word in a large number of files across multiple servers. As of now, it is a single server that uses Redis to cache the search results and can search for a word in a large number of files either sequentially or by using goroutines.

API

Search

URL
- /search
Method
- GET
URL Params
- Required
  - word=[string]
    - The word to search for
- Optional
  - forceSearch=[bool] (default: false)
    - If true, will not use the cached results
  - searchMode=[string] (default: simple)
    - The search mode to use (simple, concurrent). Can be extended to add more search modes
    - Options:
      - simple
        
        Goes through each file and searches for the word
      - concurrent
        
        Uses goroutines to search for the word in each file.
  - concurrentThreads=[int] (default: 100)
    - The number of goroutines to use when searching in concurrent mode.
Success Response
- Code: 200
- Content:
  - ```
    {
        "Word": "thee",
        "Count": 3,
        "Files": [
            "data/subfolder1/sample_text1.txt",
            "data/subfolder2/sample_xml.xml",
            "data/subfolder2/subsubfolder/sample_python.py"
        ],
        "QueryDuration": 283322650,
        "SearchMode": "concurrent"
    }
```
- Response Description
  - Word: The word that was searched for
  - Count: The number of times the word was found
  - Files: The files that the word was found in
  - QueryDuration: The time it took to search for the word (in nanoseconds)
  - SearchMode: The search mode that was used. One of simple, concurrent, or redis. This field is planned to be renamed to ResultsSource.
Sample Calls
- curl -X GET "http://localhost:8080/search?word=The"
- curl -X GET "http://localhost:8080/search?word=The&forceSearch=true"
- curl -X GET "http://localhost:8080/search?word=The&searchMode=concurrent"
- curl -X GET "http://localhost:8080/search?word=The&searchMode=concurrent&concurrentThreads=100"
- curl -X GET "http://localhost:8080/search?word=The&searchMode=concurrent&concurrentThreads=100&forceSearch=true"

Health

URL
- /health
Method
- GET
Success Response
- Code: 200
- Content:
  - ```
    OK
```
- Response Description
  - Returns OK if the server is running. Will be extended to return more information about the server in JSON format.
Sample Call
- curl -X GET "http://localhost:8080/health"

System Specs

URL
- /system-specs
Method
- GET

Success Response
- Code: 200
- Content:
  {
    "CPUName": "AMD Ryzen 7 3700X 8-Core Processor",
    "CPUCores": 8,
    "RAMSize": "7.13 GB",
    "IsDocker": false
  }

Help

URL
Method
- GET
Success Response
- Code: 200
- Content:
  - ```
    Text response
```

How to run

Prerequisites

- Go 1.14 or higher (needed for Go modules)
- Redis server

Steps

Clone the repository
Place all your files in the data directory
Copy the .env.example file to .env and edit it
- Set REDIS_HOST to the host of your Redis server
- Set REDIS_PORT to the port of your Redis server
- Set REDIS_DB to the database number to use on your Redis server
- Set REDIS_PASSWORD to the password of your Redis server
- Set PORT to the port on which the server will run
Run the Go server using one of two options
- Option 1:
  - Run go run main.go in the root directory
- Option 2:
  1. Run go build -o app in the root directory
  2. Run ./app

Known Issues

Handling multiple concurrent requests

Issue: If n clients look up an uncached word at the same time, the program launches n separate search operations for it. This can significantly slow down the system.
Potential Solutions:
- Keep track of the words being searched: Use a list of the words currently being searched, protected by a mutex to make it thread safe. If a requested word is already in the list (meaning a search for it is in progress), wait for the result to be passed through a channel instead of searching for the word again.
- Limit the total number of concurrent requests: Use some mechanism to cap the number of requests served at once. If the maximum is reached or exceeded, set a timeout for the request.
