Documentation ¶
Overview ¶
Package rl provides core infrastructure for dopamine neuromodulation and reinforcement learning, including the Rescorla-Wagner learning algorithm (RW) and Temporal Differences (TD) learning, and a minimal `ClampDaLayer` that can be used to send an arbitrary DA signal.
- `da.go` defines a simple `DALayer` interface for getting and setting dopamine values, and a `SendDA` list of layer names with convenience methods that can send dopamine to any layer implementing the `DALayer` interface.
- The RW and TD DA layers use the `CyclePost` layer-level method to send the DA to other layers at the end of each cycle, after activation is updated. Thus, DA lags by 1 cycle, which typically should not be a problem.
- See the separate `pvlv` package for the full biologically-based PVLV model built on top of this basic DA infrastructure.
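The send-at-end-of-cycle pattern and its one-cycle lag can be sketched as follows. This is a minimal, self-contained illustration rather than the package's implementation: `toyLayer`, the `map` used for layer lookup, and the `Add` and `Send` method names are assumptions made for the example; only the `GetDA`/`SetDA` interface shape comes from the documentation.

```go
package main

import "fmt"

// DALayer mirrors the documented interface shape: get and set a dopamine value.
type DALayer interface {
	GetDA() float32
	SetDA(da float32)
}

// toyLayer is a hypothetical stand-in for a real network layer.
type toyLayer struct {
	Name string
	DA   float32
}

func (l *toyLayer) GetDA() float32   { return l.DA }
func (l *toyLayer) SetDA(da float32) { l.DA = da }

// SendDA is a list of layer names to send dopamine to.
type SendDA []string

// Add appends layer names to the list (hypothetical convenience method).
func (sd *SendDA) Add(names ...string) { *sd = append(*sd, names...) }

// Send delivers da to every named layer found in the lookup table.
// Names that are not present are silently skipped in this sketch.
func (sd SendDA) Send(layers map[string]DALayer, da float32) {
	for _, nm := range sd {
		if ly, ok := layers[nm]; ok {
			ly.SetDA(da)
		}
	}
}

func main() {
	hid := &toyLayer{Name: "Hidden"}
	layers := map[string]DALayer{hid.Name: hid}
	var sd SendDA
	sd.Add("Hidden")

	daByCycle := []float32{0, 0.5, 1}
	for cyc, da := range daByCycle {
		// The activation update for this cycle sees the DA sent at the
		// end of the previous cycle, hence the one-cycle lag.
		fmt.Printf("cycle %d: layer sees DA=%v\n", cyc, hid.GetDA())
		// End-of-cycle send, analogous to the CyclePost step.
		sd.Send(layers, da)
	}
}
```

Because the send happens after the activation update, a receiving layer always works with the value computed one cycle earlier.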
Index ¶
- Constants
- Variables
- func AddRWLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
- func AddTDLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
- func MaxAbsActFmLayers(net emer.Network, lnms emer.LayNames) float32
- func SetNeuronExtPosNeg(nrn *axon.Neuron, ni int, val float32)
- type AChLayer
- type ClampAChLayer
- type ClampDaLayer
- type DALayer
- type Layer
- func (ly *Layer) Class() string
- func (ly *Layer) Defaults()
- func (ly *Layer) GetDA() float32
- func (ly *Layer) InitActs()
- func (ly *Layer) SetDA(da float32)
- func (ly *Layer) UnitVal1D(varIdx int, idx int) float32
- func (ly *Layer) UnitVarIdx(varNm string) (int, error)
- func (ly *Layer) UnitVarNum() int
- type LayerType
- type Network
- func (nt *Network) AddClampDaLayer(name string) *ClampDaLayer
- func (nt *Network) AddRSalienceLayer(name string) *RSalienceLayer
- func (nt *Network) AddRWLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
- func (nt *Network) AddTDLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
- func (nt *Network) UnitVarNames() []string
- type RSalienceLayer
- func (ly *RSalienceLayer) ActFmG(ltime *axon.Time)
- func (ly *RSalienceLayer) Build() error
- func (ly *RSalienceLayer) CyclePost(ltime *axon.Time)
- func (ly *RSalienceLayer) Defaults()
- func (ly *RSalienceLayer) GetACh() float32
- func (ly *RSalienceLayer) MaxAbsRew() float32
- func (ly *RSalienceLayer) SetACh(ach float32)
- func (ly *RSalienceLayer) UnitVal1D(varIdx int, idx int) float32
- func (ly *RSalienceLayer) UnitVarIdx(varNm string) (int, error)
- func (ly *RSalienceLayer) UnitVarNum() int
- type RWDaLayer
- type RWPredLayer
- type RWPrjn
- type RewLayer
- type SendACh
- type SendDA
- type TDDaLayer
- func (ly *TDDaLayer) ActFmG(ltime *axon.Time)
- func (ly *TDDaLayer) Build() error
- func (ly *TDDaLayer) CyclePost(ltime *axon.Time)
- func (ly *TDDaLayer) Defaults()
- func (ly *TDDaLayer) GFmSpike(ltime *axon.Time)
- func (ly *TDDaLayer) RewIntegDA(ltime *axon.Time) float32
- func (ly *TDDaLayer) RewIntegLayer() (*TDRewIntegLayer, error)
- type TDRewIntegLayer
- func (ly *TDRewIntegLayer) ActFmG(ltime *axon.Time)
- func (ly *TDRewIntegLayer) Build() error
- func (ly *TDRewIntegLayer) Defaults()
- func (ly *TDRewIntegLayer) GFmSpike(ltime *axon.Time)
- func (ly *TDRewIntegLayer) RewLayer() (*RewLayer, error)
- func (ly *TDRewIntegLayer) RewPredAct(ltime *axon.Time) float32
- func (ly *TDRewIntegLayer) RewPredLayer() (*TDRewPredLayer, error)
- type TDRewIntegParams
- type TDRewPredLayer
- type TDRewPredPrjn
Constants ¶
const (
	// RL is a reinforcement learning layer of any sort
	RL emer.LayerType = emer.LayerType(deep.LayerTypeN) + iota

	// RSalience is a reward salience coding layer sending ACh
	RSalience
)
Variables ¶
var (
	// NeuronVars are extra neuron variables for pcore
	NeuronVars = []string{"DA"}

	// NeuronVarsAll is the pcore collection of all neuron-level vars
	NeuronVarsAll []string
)
var KiT_ClampAChLayer = kit.Types.AddType(&ClampAChLayer{}, LayerProps)
var KiT_ClampDaLayer = kit.Types.AddType(&ClampDaLayer{}, LayerProps)
var KiT_Layer = kit.Types.AddType(&Layer{}, LayerProps)
var KiT_LayerType = kit.Enums.AddEnumExt(deep.KiT_LayerType, LayerTypeN, kit.NotBitFlag, nil)
var KiT_Network = kit.Types.AddType(&Network{}, NetworkProps)
var KiT_RSalienceLayer = kit.Types.AddType(&RSalienceLayer{}, LayerProps)
var KiT_RWDaLayer = kit.Types.AddType(&RWDaLayer{}, deep.LayerProps)
var KiT_RWPredLayer = kit.Types.AddType(&RWPredLayer{}, LayerProps)
var KiT_RewLayer = kit.Types.AddType(&RewLayer{}, LayerProps)
var KiT_TDDaLayer = kit.Types.AddType(&TDDaLayer{}, LayerProps)
var KiT_TDRewIntegLayer = kit.Types.AddType(&TDRewIntegLayer{}, LayerProps)
var KiT_TDRewPredLayer = kit.Types.AddType(&TDRewPredLayer{}, LayerProps)
var KiT_TDRewPredPrjn = kit.Types.AddType(&TDRewPredPrjn{}, axon.PrjnProps)
var LayerProps = ki.Props{
	"EnumType:Typ": KiT_LayerType,
	"ToolBar": ki.PropSlice{
		{"Defaults", ki.Props{
			"icon": "reset",
			"desc": "return all parameters to their intial default values",
		}},
		{"InitWts", ki.Props{
			"icon": "update",
			"desc": "initialize the layer's weight values according to prjn parameters, for all *sending* projections out of this layer",
		}},
		{"InitActs", ki.Props{
			"icon": "update",
			"desc": "initialize the layer's activation values",
		}},
		{"sep-act", ki.BlankProp{}},
		{"LesionNeurons", ki.Props{
			"icon": "close",
			"desc": "Lesion (set the Off flag) for given proportion of neurons in the layer (number must be 0 -- 1, NOT percent!)",
			"Args": ki.PropSlice{
				{"Proportion", ki.Props{
					"desc": "proportion (0 -- 1) of neurons to lesion",
				}},
			},
		}},
		{"UnLesionNeurons", ki.Props{
			"icon": "reset",
			"desc": "Un-Lesion (reset the Off flag) for all neurons in the layer",
		}},
	},
}
LayerProps are required to get the extended EnumType
var NetworkProps = axon.NetworkProps
Functions ¶
func AddRWLayers ¶
func AddRWLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
AddRWLayers adds a simple Rescorla-Wagner (PV only) dopamine system, with a primary Reward layer, a RWPred prediction layer, and a dopamine layer that computes the difference between them. It only generates DA when the Rew layer has external input -- otherwise DA is zero.
func AddTDLayers ¶
func AddTDLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
AddTDLayers adds the standard TD temporal differences layers, generating a DA signal. The projection from Rew to RewInteg is given class TDRewToInteg -- it should have no learning and a fixed weight of 1.
func MaxAbsActFmLayers ¶ added in v1.5.12
MaxAbsActFmLayers returns the maximum absolute value of layer activations from an emer.LayNames list of layers. Iterates over neurons in RewLayer because Inhib.Act.Max does not deal with negative numbers.
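A plain maximum would report 0 for a layer whose only nonzero activation is negative, which is why the absolute value is taken per neuron. The sketch below illustrates that scan on plain slices; the `map` keyed by layer name is an assumption standing in for the network lookup, and neither function is the package's actual code.

```go
package main

import "fmt"

// MaxAbs returns the largest absolute value in acts (0 for an empty slice).
func MaxAbs(acts []float32) float32 {
	var mx float32
	for _, a := range acts {
		if a < 0 {
			a = -a
		}
		if a > mx {
			mx = a
		}
	}
	return mx
}

// MaxAbsOverLayers scans the activations of several named layers.
// The map is a hypothetical stand-in for looking layers up in a network;
// names that are missing contribute nothing.
func MaxAbsOverLayers(layers map[string][]float32, names []string) float32 {
	var mx float32
	for _, nm := range names {
		if v := MaxAbs(layers[nm]); v > mx {
			mx = v
		}
	}
	return mx
}

func main() {
	layers := map[string][]float32{
		"Rew":     {0, -0.75},
		"RewPred": {0.5, 0},
	}
	fmt.Println(MaxAbsOverLayers(layers, []string{"Rew", "RewPred"}))
}
```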
Types ¶
type AChLayer ¶
type AChLayer interface {
	// GetACh returns the acetylcholine level for layer
	GetACh() float32

	// SetACh sets the acetylcholine level for layer
	SetACh(ach float32)
}
AChLayer is an interface for a layer with acetylcholine neuromodulator on it
type ClampAChLayer ¶
type ClampAChLayer struct {
	axon.Layer
	SendACh SendACh `desc:"list of layers to send acetylcholine to"`
	ACh     float32 `desc:"acetylcholine value for this layer"`
}
ClampAChLayer is an Input layer that just sends its activity as the acetylcholine signal
func (*ClampAChLayer) Build ¶
func (ly *ClampAChLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*ClampAChLayer) CyclePost ¶
func (ly *ClampAChLayer) CyclePost(ltime *axon.Time)
CyclePost is called at the end of each Cycle. We use it to send ACh, which will then be active for the next cycle of processing.
func (*ClampAChLayer) GetACh ¶
func (ly *ClampAChLayer) GetACh() float32
func (*ClampAChLayer) SetACh ¶
func (ly *ClampAChLayer) SetACh(ach float32)
type ClampDaLayer ¶
ClampDaLayer is an Input layer that just sends its activity as the dopamine signal
func AddClampDaLayer ¶
func AddClampDaLayer(nt *axon.Network, name string) *ClampDaLayer
AddClampDaLayer adds a ClampDaLayer of given name
func (*ClampDaLayer) ActFmG ¶ added in v1.5.1
func (ly *ClampDaLayer) ActFmG(ltime *axon.Time)
func (*ClampDaLayer) Build ¶
func (ly *ClampDaLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*ClampDaLayer) CyclePost ¶
func (ly *ClampDaLayer) CyclePost(ltime *axon.Time)
CyclePost is called at the end of each Cycle. We use it to send DA, which will then be active for the next cycle of processing.
func (*ClampDaLayer) Defaults ¶
func (ly *ClampDaLayer) Defaults()
type DALayer ¶
type DALayer interface {
	// GetDA returns the dopamine level for layer
	GetDA() float32

	// SetDA sets the dopamine level for layer
	SetDA(da float32)
}
DALayer is an interface for a layer with dopamine neuromodulator on it
type Layer ¶ added in v1.4.14
Layer is the base layer type for RL framework. Adds a dopamine variable to base Axon layer type.
func (*Layer) UnitVal1D ¶ added in v1.4.14
UnitVal1D returns the value of the given variable index on the given unit, using a 1-dimensional index. It returns NaN on an invalid index. This is the core unit variable access method used by other methods, so it is the only one that needs to be updated for derived layer types.
func (*Layer) UnitVarIdx ¶ added in v1.4.14
UnitVarIdx returns the index of given variable within the Neuron, according to UnitVarNames() list (using a map to lookup index), or -1 and error message if not found.
func (*Layer) UnitVarNum ¶ added in v1.4.14
UnitVarNum returns the number of Neuron-level variables for this layer. This is needed for extending indexes in derived types.
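The way a derived layer extends the base variable list can be sketched as below. This is an illustration under stated assumptions, not the package's code: the base layer is reduced to a single hypothetical `Act` variable, `DA` is appended after it, and the `Layer` type here is a toy with plain fields.

```go
package main

import (
	"fmt"
	"math"
)

// varNames lists a hypothetical base variable followed by the extra DA variable.
var varNames = []string{"Act", "DA"}

// varIdxMap is built once so that name lookup does not scan the list.
var varIdxMap = func() map[string]int {
	m := make(map[string]int, len(varNames))
	for i, nm := range varNames {
		m[nm] = i
	}
	return m
}()

// Layer is a toy layer: per-unit activations plus one layer-wide DA value.
type Layer struct {
	Acts []float32
	DA   float32
}

// UnitVarNum returns the number of unit-level variables, including DA.
func (ly *Layer) UnitVarNum() int { return len(varNames) }

// UnitVarIdx returns the index of the named variable, or -1 and an error.
func (ly *Layer) UnitVarIdx(varNm string) (int, error) {
	idx, ok := varIdxMap[varNm]
	if !ok {
		return -1, fmt.Errorf("variable named %q not found", varNm)
	}
	return idx, nil
}

// UnitVal1D returns the value of a variable on a unit, or NaN if either
// index is out of range. DA is layer-wide, so every unit reports the same value.
func (ly *Layer) UnitVal1D(varIdx int, idx int) float32 {
	if idx < 0 || idx >= len(ly.Acts) || varIdx < 0 || varIdx >= ly.UnitVarNum() {
		return float32(math.NaN())
	}
	if varIdx == 0 {
		return ly.Acts[idx]
	}
	return ly.DA
}

func main() {
	ly := &Layer{Acts: []float32{0.25, 0.5}, DA: -1}
	if i, err := ly.UnitVarIdx("DA"); err == nil {
		fmt.Println("DA index:", i, "value on unit 0:", ly.UnitVal1D(i, 0))
	}
	_, err := ly.UnitVarIdx("Nope")
	fmt.Println("lookup error:", err)
}
```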
type LayerType ¶ added in v1.5.10
LayerType has the extensions to the emer.LayerType types, for the GUI
const (
	RL_ LayerType = LayerType(deep.LayerTypeN) + iota
	RSalience_
	LayerTypeN
)
gui versions
func StringToLayerType ¶ added in v1.5.10
type Network ¶ added in v1.4.14
rl.Network enables display of the Da variable for pure rl models
func (*Network) AddClampDaLayer ¶ added in v1.4.14
func (nt *Network) AddClampDaLayer(name string) *ClampDaLayer
AddClampDaLayer adds a ClampDaLayer of given name
func (*Network) AddRSalienceLayer ¶ added in v1.5.12
func (nt *Network) AddRSalienceLayer(name string) *RSalienceLayer
AddRSalienceLayer adds a rl.RSalienceLayer unsigned reward salience coding ACh layer.
func (*Network) AddRWLayers ¶ added in v1.4.14
func (nt *Network) AddRWLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
AddRWLayers adds a simple Rescorla-Wagner (PV only) dopamine system, with a primary Reward layer, a RWPred prediction layer, and a dopamine layer that computes the difference between them. It only generates DA when the Rew layer has external input -- otherwise DA is zero.
func (*Network) AddTDLayers ¶ added in v1.4.14
func (nt *Network) AddTDLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
AddTDLayers adds the standard TD temporal differences layers, generating a DA signal. The projection from Rew to RewInteg is given class TDRewToInteg -- it should have no learning and a fixed weight of 1.
func (*Network) UnitVarNames ¶ added in v1.4.14
UnitVarNames returns a list of variable names available on the units in this layer
type RSalienceLayer ¶ added in v1.5.12
type RSalienceLayer struct {
	axon.Layer
	RewThr    float32       `` /* 166-byte string literal not displayed */
	RewLayers emer.LayNames `desc:"Reward-representing layer(s) from which this computes ACh as Max absolute value"`
	SendACh   SendACh       `desc:"list of layers to send acetylcholine to"`
	ACh       float32       `desc:"acetylcholine value for this layer"`
}
RSalienceLayer reads reward signals from the named source layer(s) and sends the maximum absolute value of that activity as an acetylcholine (ACh) signal: a positively rectified, non-prediction-discounted reward salience signal. To handle positive-only reward signals, include both a reward prediction layer and a reward outcome layer.
func AddRSalienceLayer ¶ added in v1.5.12
func AddRSalienceLayer(nt *axon.Network, name string) *RSalienceLayer
AddRSalienceLayer adds a RSalienceLayer unsigned reward salience coding ACh layer.
func (*RSalienceLayer) ActFmG ¶ added in v1.5.12
func (ly *RSalienceLayer) ActFmG(ltime *axon.Time)
func (*RSalienceLayer) Build ¶ added in v1.5.12
func (ly *RSalienceLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*RSalienceLayer) CyclePost ¶ added in v1.5.12
func (ly *RSalienceLayer) CyclePost(ltime *axon.Time)
CyclePost is called at the end of each Cycle. We use it to send ACh, which will then be active for the next cycle of processing.
func (*RSalienceLayer) Defaults ¶ added in v1.5.12
func (ly *RSalienceLayer) Defaults()
func (*RSalienceLayer) GetACh ¶ added in v1.5.12
func (ly *RSalienceLayer) GetACh() float32
func (*RSalienceLayer) MaxAbsRew ¶ added in v1.5.12
func (ly *RSalienceLayer) MaxAbsRew() float32
MaxAbsRew returns the maximum absolute value of reward layer activations
func (*RSalienceLayer) SetACh ¶ added in v1.5.12
func (ly *RSalienceLayer) SetACh(ach float32)
func (*RSalienceLayer) UnitVal1D ¶ added in v1.5.12
func (ly *RSalienceLayer) UnitVal1D(varIdx int, idx int) float32
UnitVal1D returns the value of the given variable index on the given unit, using a 1-dimensional index. It returns NaN on an invalid index. This is the core unit variable access method used by other methods, so it is the only one that needs to be updated for derived layer types.
func (*RSalienceLayer) UnitVarIdx ¶ added in v1.5.12
func (ly *RSalienceLayer) UnitVarIdx(varNm string) (int, error)
UnitVarIdx returns the index of given variable within the Neuron, according to UnitVarNames() list (using a map to lookup index), or -1 and error message if not found.
func (*RSalienceLayer) UnitVarNum ¶ added in v1.5.12
func (ly *RSalienceLayer) UnitVarNum() int
UnitVarNum returns the number of Neuron-level variables for this layer. This is needed for extending indexes in derived types.
type RWDaLayer ¶
type RWDaLayer struct {
	Layer
	SendDA    SendDA `desc:"list of layers to send dopamine to"`
	RewLay    string `desc:"name of Reward-representing layer from which this computes DA -- if nothing clamped, no dopamine computed"`
	RWPredLay string `desc:"name of RWPredLayer layer that is subtracted from the reward value"`
}
RWDaLayer computes a dopamine (DA) signal based on a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework). It computes the difference between r(t) and the RWPred value. r(t) is accessed directly from a Rew layer -- if that layer has no external input then no DA is computed, which is critical for using RW effectively for PV-only cases. The RWPred prediction is likewise accessed directly, from the layer named in RWPredLay, to avoid any issues.
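The gating described here can be written as a small function. The sketch below is an illustration on plain floats, not the layer's implementation; the `hasRew` flag is an assumption standing in for "the Rew layer currently has external input".

```go
package main

import "fmt"

// RWDa returns the Rescorla-Wagner dopamine signal r(t) - pred.
// When no reward is clamped (hasRew is false) the result is 0, so that
// the absence of a reward is not mistaken for a reward of zero.
func RWDa(rew, pred float32, hasRew bool) float32 {
	if !hasRew {
		return 0
	}
	return rew - pred
}

func main() {
	fmt.Println(RWDa(1, 0.25, true))  // reward larger than predicted: positive DA
	fmt.Println(RWDa(0, 0.25, true))  // predicted reward omitted: negative DA
	fmt.Println(RWDa(0, 0.25, false)) // nothing clamped: no DA at all
}
```

The distinction between the second and third calls is the point of the gating: an explicitly clamped reward of 0 produces a negative prediction error, while an unclamped Rew layer produces none.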
func (*RWDaLayer) Build ¶
Build constructs the layer state, including calling Build on the projections.
type RWPredLayer ¶
type RWPredLayer struct {
	Layer
	PredRange minmax.F32 `` /* 180-byte string literal not displayed */
}
RWPredLayer computes reward prediction for a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework). Activity is computed as linear function of excitatory conductance (which can be negative -- there are no constraints). Use with RWPrjn which does simple delta-rule learning on minus-plus.
func (*RWPredLayer) ActFmG ¶
func (ly *RWPredLayer) ActFmG(ltime *axon.Time)
func (*RWPredLayer) Defaults ¶
func (ly *RWPredLayer) Defaults()
type RWPrjn ¶
type RWPrjn struct {
	axon.Prjn
	DaTol        float32 `` /* 208-byte string literal not displayed */
	OppSignLRate float32 `desc:"how much to learn on opposite DA sign coding neuron (0..1)"`
}
RWPrjn does dopamine-modulated learning for reward prediction: Da * Send.Act. Use it in RWPredLayer, typically to generate reward predictions. It has no weight bounds or limits on sign.
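To see why a Da * Send.Act update drives the prediction toward the reward, consider a single weight and a single sending unit. The sketch below is a toy model under stated assumptions: the `lrate` parameter and the linear prediction `w * sendAct` are introduced for the example, and `DaTol` and `OppSignLRate` are deliberately left out.

```go
package main

import "fmt"

// RWStep runs one trial: it forms the prediction from the current weight,
// computes DA as reward minus prediction, and applies the delta rule
// dwt = lrate * da * sendAct. The weight is unbounded and may go negative.
func RWStep(w, sendAct, rew, lrate float32) (newW, da float32) {
	pred := w * sendAct
	da = rew - pred
	newW = w + lrate*da*sendAct
	return newW, da
}

func main() {
	var w float32
	for trial := 0; trial < 5; trial++ {
		var da float32
		w, da = RWStep(w, 1, 1, 0.5)
		fmt.Printf("trial %d: da=%v w=%v\n", trial, da, w)
	}
}
```

With these settings the prediction error halves on every trial, so DA shrinks toward zero as the weight approaches the reward value.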
type RewLayer ¶ added in v1.4.14
type RewLayer struct {
Layer
}
RewLayer represents positive or negative reward values across 2 units, showing spiking rates for each, and Act always represents signed value.
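One way to spread a signed reward across two units is sketched below. The unit ordering (index 0 for positive, index 1 for negative) and the function names are assumptions made for this illustration and are not confirmed by the text above.

```go
package main

import "fmt"

// ExtPosNeg returns the external input for unit ni of a two-unit reward
// code. In this sketch unit 0 carries positive values and unit 1 carries
// the magnitude of negative values; the inactive unit receives 0.
func ExtPosNeg(ni int, val float32) float32 {
	if ni == 0 {
		if val >= 0 {
			return val
		}
		return 0
	}
	if val >= 0 {
		return 0
	}
	return -val
}

// Signed recovers the signed value from the two unit inputs.
func Signed(pos, neg float32) float32 { return pos - neg }

func main() {
	for _, r := range []float32{0.5, -0.25, 0} {
		p, n := ExtPosNeg(0, r), ExtPosNeg(1, r)
		fmt.Printf("reward %v -> pos unit %v, neg unit %v, signed %v\n", r, p, n, Signed(p, n))
	}
}
```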
func AddRewLayer ¶ added in v1.5.12
AddRewLayer adds a RewLayer of given name
type SendACh ¶
SendACh is a list of layers to send acetylcholine to
func (*SendACh) AddOne ¶
AddOne adds one layer name to the list. This is the Python-compatible version, since the Python interface does not support varargs.
type SendDA ¶
SendDA is a list of layers to send dopamine to
func (*SendDA) AddOne ¶
AddOne adds one layer name to the list. This is the Python-compatible version, since the Python interface does not support varargs.
type TDDaLayer ¶
type TDDaLayer struct {
	Layer
	SendDA   SendDA `desc:"list of layers to send dopamine to"`
	RewInteg string `desc:"name of TDRewIntegLayer from which this computes the temporal derivative"`
}
TDDaLayer computes a dopamine (DA) signal as the temporal difference (TD) between the TDRewIntegLayer activations in the minus and plus phase.
func (*TDDaLayer) Build ¶
Build constructs the layer state, including calling Build on the projections.
func (*TDDaLayer) CyclePost ¶
CyclePost is called at the end of each Cycle. We use it to send DA, which will then be active for the next cycle of processing.
func (*TDDaLayer) RewIntegDA ¶ added in v1.4.14
func (*TDDaLayer) RewIntegLayer ¶
func (ly *TDDaLayer) RewIntegLayer() (*TDRewIntegLayer, error)
type TDRewIntegLayer ¶
type TDRewIntegLayer struct {
	Layer
	RewInteg TDRewIntegParams `desc:"parameters for reward integration"`
}
TDRewIntegLayer is the temporal differences reward integration layer. It represents estimated value V(t) in the minus phase, and estimated V(t+1) + r(t) in the plus phase. It directly accesses r(t) from the Rew layer, and V(t) from the RewPred layer.
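The two phase values and the resulting TD signal can be sketched on plain numbers. This is a hedged illustration: the exact placement of `Discount` and `RewPredGain` in the formulas is an assumption based on their field descriptions, and the `TDParams` type and function names are introduced only for the example.

```go
package main

import "fmt"

// TDParams holds the two integration parameters named in TDRewIntegParams.
type TDParams struct {
	Discount    float32 // discount on the future prediction
	RewPredGain float32 // gain on reward prediction activations
}

// MinusAct is the integrator value in the minus phase: the current estimate V(t).
func (tp TDParams) MinusAct(vNow float32) float32 {
	return tp.RewPredGain * vNow
}

// PlusAct is the integrator value in the plus phase: r(t) plus the
// discounted next estimate V(t+1).
func (tp TDParams) PlusAct(rew, vNext float32) float32 {
	return rew + tp.Discount*tp.RewPredGain*vNext
}

// TDDa is the temporal difference between the plus and minus phase values.
func TDDa(minus, plus float32) float32 { return plus - minus }

func main() {
	tp := TDParams{Discount: 0.5, RewPredGain: 1}
	minus := tp.MinusAct(0.5)
	plus := tp.PlusAct(0.25, 1)
	fmt.Printf("minus=%v plus=%v da=%v\n", minus, plus, TDDa(minus, plus))
}
```

When the plus-phase value matches the minus-phase value, the estimate was consistent with what followed and the DA signal is zero.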
func (*TDRewIntegLayer) ActFmG ¶
func (ly *TDRewIntegLayer) ActFmG(ltime *axon.Time)
func (*TDRewIntegLayer) Build ¶
func (ly *TDRewIntegLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*TDRewIntegLayer) Defaults ¶
func (ly *TDRewIntegLayer) Defaults()
func (*TDRewIntegLayer) GFmSpike ¶ added in v1.5.12
func (ly *TDRewIntegLayer) GFmSpike(ltime *axon.Time)
func (*TDRewIntegLayer) RewLayer ¶ added in v1.4.14
func (ly *TDRewIntegLayer) RewLayer() (*RewLayer, error)
func (*TDRewIntegLayer) RewPredAct ¶ added in v1.4.14
func (ly *TDRewIntegLayer) RewPredAct(ltime *axon.Time) float32
func (*TDRewIntegLayer) RewPredLayer ¶
func (ly *TDRewIntegLayer) RewPredLayer() (*TDRewPredLayer, error)
type TDRewIntegParams ¶
type TDRewIntegParams struct {
	Discount    float32 `desc:"discount factor -- how much to discount the future prediction from RewPred"`
	RewPredGain float32 `desc:"gain factor on rew pred activations"`
	RewPred     string  `desc:"name of TDRewPredLayer to get reward prediction from "`
	Rew         string  `desc:"name of RewLayer to get current reward from "`
}
TDRewIntegParams are params for reward integrator layer
func (*TDRewIntegParams) Defaults ¶
func (tp *TDRewIntegParams) Defaults()
type TDRewPredLayer ¶
type TDRewPredLayer struct {
Layer
}
TDRewPredLayer is the temporal differences reward prediction layer. It represents estimated value V(t) in the minus phase, and computes estimated V(t+1) based on its learned weights in plus phase. Use TDRewPredPrjn for DA modulated learning.
func (*TDRewPredLayer) ActFmG ¶
func (ly *TDRewPredLayer) ActFmG(ltime *axon.Time)
func (*TDRewPredLayer) Defaults ¶ added in v1.4.14
func (ly *TDRewPredLayer) Defaults()
type TDRewPredPrjn ¶
type TDRewPredPrjn struct {
	axon.Prjn
	OppSignLRate float32 `desc:"how much to learn on opposite DA sign coding neuron (0..1)"`
}
TDRewPredPrjn does dopamine-modulated learning for reward prediction: DWt = Da * Send.SpkPrv (activity on the *previous* timestep). Use it in TDRewPredLayer, typically to generate reward predictions. If the Da sign is positive, the first recv unit learns fully; if it is negative, the second one learns fully. A lower learning rate applies in the opposite cases. Weights are positive-only.
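The sign-dependent learning rate can be sketched as below. Several details are assumptions made for the illustration: the explicit `lrate` parameter, the sign flip for the second receiving unit (so that negative Da increases its weight), and enforcing positive-only weights by clamping at zero.

```go
package main

import "fmt"

// DWt returns the weight change for receiving unit ri (0 = positive-coding,
// 1 = negative-coding) given dopamine da and the sender's activity on the
// previous timestep. The unit whose coding sign matches da learns at the
// full rate; the other learns at lrate * oppSignLRate.
func DWt(da, sendPrv, lrate, oppSignLRate float32, ri int) float32 {
	eff := lrate
	if ri == 0 {
		if da < 0 {
			eff *= oppSignLRate
		}
	} else {
		eff = -eff // assumed: negative da strengthens the negative-coding unit
		if da >= 0 {
			eff *= oppSignLRate
		}
	}
	return eff * da * sendPrv
}

// ApplyDWt adds the change to the weight and keeps the result non-negative.
func ApplyDWt(w, dwt float32) float32 {
	w += dwt
	if w < 0 {
		return 0
	}
	return w
}

func main() {
	for _, da := range []float32{1, -1} {
		for ri := 0; ri < 2; ri++ {
			fmt.Printf("da=%v recv unit %d: dwt=%v\n", da, ri, DWt(da, 1, 0.5, 0.5, ri))
		}
	}
	fmt.Println("clamped weight:", ApplyDWt(0.125, -0.25))
}
```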
func (*TDRewPredPrjn) DWt ¶
func (pj *TDRewPredPrjn) DWt(ltime *axon.Time)
DWt computes the weight change (learning) -- on sending projections.
func (*TDRewPredPrjn) Defaults ¶
func (pj *TDRewPredPrjn) Defaults()
func (*TDRewPredPrjn) WtFmDWt ¶
func (pj *TDRewPredPrjn) WtFmDWt(ltime *axon.Time)
WtFmDWt updates the synaptic weight values from delta-weight changes -- on sending projections