README ¶
Timer framework usage
Introduction
Timer framework is an internal framework for TiDB to run tasks periodically. You can use it if you have below requirements:
- Run a task periodically and the scheduling policy should fit some specified semantics. For example, run a task every 5 minutes, or run a task every day at 00:00:00.
- Need to run the task in a distributed environment. For example, you have 3 TiDB servers, and you want to run the task on only one of them. And you also want to do some recovery work when the TiDB process restarts.
- Need to read the persisted schedule runtime information of the timer to do some work somewhere like monitoring.
- Your trigger action needs to meet some message delivery semantics such as at least once, at most once, or exactly once.
What timer framework is not suitable for you:
- Some work that the frequency is too high. For example, you want to run a task every 1 second. Though you can set the scheduling policy with an interval of "1s", it is not efficient because every time it triggers, the framework will make some calls to the store. In this case, you should use a timer of Golang in your own code.
- The timer framework only helps you to schedule a "trigger" action but not to schedule some heavy jobs across the nodes. That means the trigger action should be light-weighted and can be finished in a short time. If you want to schedule some heavy jobs, you should use some other distributed job scheduler like
disttask
in another package.
Usage
Timer client operations
Create a timer client
A client must be created from a store. The store is used to persist the runtime information of the timer. Currently, we use TiDB system table to store the information in the production environment. But if you want to do some simple tests in your own test case, you can use a memory store, for example:
package main
import (
"fmt"
"github.com/pingcap/tidb/timer/api"
)
func main() {
store := api.NewMemoryTimerStore()
client := api.NewDefaultTimerClient(store)
fmt.Println(client.GetDefaultNamespace())
}
The method client.GetDefaultNamespace()
returns the current namespace of the client. It is used to distinguish different tenants in the future, but now it is always default
.
You can check the method tablestore.NewTableTimerStore
to see how to create a store from a TiDB system table.
Create a timer with the client
If you want to do something every 1 hour, you can create a timer like this:
timer, err := client.CreateTimer(ctx, api.TimerSpec{
Key: "/your/timer/key",
SchedPolicyType: api.SchedEventInterval,
SchedPolicyExpr: "1h",
HookClass: "your.hook.class.name",
Watermark: time.Now(),
Data: []byte("yourdata"),
Enable: true,
})
if err != nil {
// handle err
}
fmt.Printf("created timer id: %s\n", timer.ID)
The specified Key
field is unique under a namespace. If you want to create a timer with the same key, you must delete the old one with the same key first if it exists.
To custom the timer schedule policy, you can specify the values of fields SchedPolicyType
and SchedPolicyExpr
. Currently, we support 2 types of schedule policy:
INTERVAL
, triggers with a fixed interval, for example,1h
means every 1 hour.CRON
, triggers with a cron expression, for example,0 0 0 * * *
means every day at 00:00:00.
The field HookClass
is optional. It tells the framework which hook class to use to trigger the action. If you do not specify it, the framework will set the timer's event status to TRIGGER
without calling any custom hook class code.
The field Watermark
is optional. If it is set, the timer will not trigger until there is an event not triggered between the watermark and now. If it is not set, the timer will try triggering the timer immediately after it is created.
The field Enable
is optional. But if you want to keep the timer to schedule the new events, you must set it to true
. If you set this value to false
, the timer will stop scheduling any new event until you set it to true
again.
The Data
field is also optional, you can set any binary data you want. When the timer triggers, the framework will pass the data to the hook class.
When a timer is created successfully, the field ID
will be assigned automatically. You can use this ID
to do some operations on the timer.
Query timers with the client
You can use the client to query timer(s), for example:
// query timers with the specified key prefix
timers, err := client.GetTimers(context.TODO(), api.WithKeyPrefix("/your/timer/"))
if err != nil {
// handle err ...
}
// query timer with the specified key
timer, err := client.GetTimerByKey(ctx, "/your/timer/key")
if err != nil {
// handle err ...
}
// query timer with the specified id
timer, err := client.GetTimerByID(ctx, "123")
if err != nil {
// handle err ...
}
Update a timer
You can update a timer's metadata with the method UpdateTimer
, for example:
// update the schedule policy
if err := client.UpdateTimer(ctx, timer.ID, api.WithSetSchedExpr(api.SchedEventCron, "0 0 * * * *")); err != nil {
// handle error
}
// update the watermark
if err := client.UpdateTimer(ctx, timer.ID, api.WithSetWatermark(time.Now())); err != nil {
// handle error
}
Delete a timer
You can delete a timer with the method DeleteTimer
, for example:
exist, err := client.DeleteTimer(ctx, timer.ID)
if err != nil {
// handle error
}
An extra value exist
is returned. If the timer exists, it will be deleted and exist
will be true
. Otherwise, exist
will be false
.
Set up a runtime to schedule timers
If you only create a timer through a client, the timer will not run by default. You must set up a timer runtime to schedule the timers. You can create a new runtime.TimerGroupRuntime
and then start it, for example:
package main
import "github.com/pingcap/tidb/timer/api"
func main() {
ctx := ... // some go context
store := ... // Create a timer store.
rt := runtime.NewTimerRuntimeBuilder("myGroup", store).
RegisterHookFactory("your.hook.class.name", MyTimerHookFactory).
SetCond(&api.TimerCond{
Key: api.NewOptionalVal("/"), KeyPrefix: true,
}).
Build()
rt.Start()
defer rt.Stop()
<-ctx.Done()
}
We use a builder to build a new timer. You can specify a group name when creating it. The group name is only used by monitoring now. RegisterHookFactory
registers your custom hook class which will be called when the runtime trys to trigger a event. SetCond
sets a condition and the runtime will only schedule the timers which match the condition. If you do not set the condition, the runtime will schedule all the timers.
If you start more than one timer runtimes with overlapped conditions, the hook functions have a probability to be called more than once. So it's better to use some distributed lock to ensure that only one runtime is running.
In the future, we are going to provide some preserved runtimes in TiDB system, so that you can use them directly without creating a new one.
Custom your hook class
You can implement your own hook class by implementing the interface api.HookClass
, for example:
func MyTimerHookFactory(hookClass string, cli api.TimerClient) api.Hook {
return &MyTimerHook{
cli: cli,
}
}
type MyTimerHook struct {
cli api.TimerClient
}
func (h *MyTimerHook) Start() {
// You should do some init works here.
}
func (h *MyTimerHook) Stop() {
// You should do some clear works here.
}
func (h *MyTimerHook) OnPreSchedEvent(ctx context.Context, event api.TimerShedEvent) (api.PreSchedEventResult, error) {
fmt.Printf("OnPreSchedEvent: %s\n", event.EventID())
return api.PreSchedEventResult{}, nil
}
func (h *MyTimerHook) OnSchedEvent(ctx context.Context, event api.TimerShedEvent) error {
fmt.Printf("OnSchedEvent: %s, event start: %s\n", event.EventID(), event.Timer().EventStart)
return h.cli.CloseTimerEvent(ctx, event.Timer().ID, event.EventID())
}
The function MyTimerHookFactory
is used as a parameter of RegisterHookFactory
in the previous section. It is used to create a new hook instance when the runtime needs it. A parameter cli
with type api.TimerClient
will be provided to help to construct the hook, and the hook can use the client to do some further timer operations.
All the timers with the same hook class will share the same hook instance. When the runtime finds a timer that is ready to trigger, it will first find whether the corresponding hook is created. If not, it will call the hook factory to create a new hook instance. Then it will call the hook's Start
method to do some preparation work. The Start
method will only be called once. After that, for each trigger action, the runtime will call the hook's OnPreSchedEvent
method to do some pre-trigger works.
OnPreSchedEvent
returns a PreSchedEventResult
object to tell the runtime what to do next. In most cases, we just need to return an empty PreSchedEventResult
to continue the trigger action. But sometimes you may want to delay the schedule for some reason such as some schedule window limits required by business. You can delay the schedule like this:
func shouldDelaySchedule() bool {
// your function to check whether to delay a timer's schedule
}
func (h *MyTimerHook) OnPreSchedEvent(ctx context.Context, event api.TimerShedEvent) (api.PreSchedEventResult, error) {
if shouldDelaySchedule() {
return api.PreSchedEventResult{
Delay: time.Minute,
}, nil
}
return api.PreSchedEventResult{}, nil
}
In the above example, the delay time is 1 minute. After 1 minute, OnPreSchedEvent
will be called again, and the trigger action will not continue until the method OnPreSchedEvent
returns no delay.
If you want to carry some extra data to the trigger action, you can set the PreSchedEventResult
's EventData
field. The data will be passed to the trigger action. For example:
func (h *MyTimerHook) OnPreSchedEvent(ctx context.Context, event api.TimerShedEvent) (api.PreSchedEventResult, error) {
if shouldDelaySchedule() {
return api.PreSchedEventResult{
EventData: []byte("mycustomeventdata"),
}, nil
}
return api.PreSchedEventResult{}, nil
}
The EventData
will be persisted before the action is actually triggered. So you can use it to do some recovery work if the TiDB process restarts.
After PreSchedEventResult
indicates to continue the trigger action, the event meta including the EventData
will then be persisted to the store. After that, if you query this timer with the client, you can see the EventStatus
changes to TRIGGER
from IDLE
, and the EventID
is also not empty.
Then the hook method OnSchedEvent
will be called. You can do some real trigger action here. For example, you can send a message to a message queue or submit some job to an outer job system. If your job is not very heavy, you can even do it directly in this method. Please notice that the method OnSchedEvent
will only be called once if it returns no error. Only some cases may cause the method OnSchedEvent
to be called again:
- The method
OnSchedEvent
returns an error. - The runtime restarts and finds a timer's
EventStatus
isTRIGGER
.
You should close the timer event by calling the client's method CloseTimerEvent
to reset the timer's state to IDLE
, or the timer will not be able to schedule new events. By default, closing a timer will also reset the Watermark
field to the value of EventStart
, but you can specify the next watermark manually. For example:
func (h *MyTimerHook) OnSchedEvent(ctx context.Context, event api.TimerShedEvent) error {
// do some things ...
// close the timer event so that the next event can be scheduled
return h.cli.CloseTimerEvent(
ctx,
event.Timer().ID,
event.EventID(),
api.WithSetWatermark(time.Now()), // use the current time as the next watermark instead of the event start time
)
}
FAQs
-
Q: What is the difference between the attributes
ID
andKey
of the timer, they are all unique. A: The key difference is that theKey
is specified by the user but theID
is generated by the framework. TheID
is the real identifier of a timer because it is "absolutely" unique because any two timers cannot have the sameID
even if one timer is deleted. But it is allowed to create a timer with aKey
that is the same as another timer which deleted before, and it is also allowed to create timers with the sameKey
in different namespaces. So it is better to useID
to log events from a certain timer. ButKey
is also useful because it can be meaningful.Key
can be used to do some semantic works like querying or ensuring the idempotent creation of timers for certain uses. -
Q: What is the behavior of the timer framework if the TiDB restarts, especially if it shuts down for a long time? A: When the TiDB restarts (which means the timer runtime restarts), the runtime will reload all timers from the store and check their state. If a timer's
EventStatus
isTRIGGER
, the runtime will try to trigger the event again by calling the hook's methodOnSchedEvent
(OnPreSchedEvent
will not be called). If theEventStatus
isIDLE
, the runtime will try to trigger the event if the timer is time to trigger. Notice that when a runtime shuts down for a long time, there may be many time points to trigger betweenWatermark
and the current time, but the hook will still be triggered only once. For example, one timer has a cron expression0 * * * * *
indicating that it should trigger every one hour. If the TiDB shuts down, leave this timer with a watermark2022-01-01 09:58:00
and then restart at2022-01-01 12:01:00
, and it tries to trigger the timer. However, there are 3 time points which are not triggered. The hook will still be invoked once with theEventStart
value2022-01-01 12:01:00
andWatmerkark
value2022-01-01 09:58:00
. It is up to the hook class to determine how to handle this case. For example, it can do the trigger action 3 times in oneOnSchedEvent
call or just do it once and then close the timer with the new watermark2022-01-01 10:00:00
. For the second case, the timer will be triggered again after a very short time because the next expected trigger time2022-01-01 11:00:00
is still before the current time. -
Q: Is it thread-safe for the hook call such as
Start
,Stop
,OnPreSchedEvent
,OnSchedEvent
? A: Yes, it is thread-safe. They are called in one goroutine now, there is no data race without a lock for variables that are only read or written in these methods. But if you create a goroutine manually and access these variables in the background, you should also use a lock to protect them. In the future, the trigger method such asOnPreSchedEvent
andOnSchedEvent
may be called in concurrency of different timers, but it will also be guaranteed that these methods for the same timer will be called in sequence and guarantees the happens-before semantics. -
Q: Will the hook calls from different timers be blocked by each other if some hook calls take a long time? A: Yes, it may happen if two timers share the same hook and one timer's call is slow. The reason is timers that share one hook will be triggered in one goroutine (or maybe a pool with some fixed goroutines in the future). To resolve this issue, you can just do the trigger action background and return a non-error value immediately in the foreground call. Notice that if you do this, you should retry yourself when the trigger action fails.