Documentation ¶
Overview ¶
Package rl provides core infrastructure for dopamine neuromodulation and reinforcement learning, including the Rescorla-Wagner learning algorithm (RW) and Temporal Differences (TD) learning, and a minimal `ClampDaLayer` that can be used to send an arbitrary DA signal.
- `da.go` defines a simple `DALayer` interface for getting and setting dopamine values, and a `SendDA` list of layer names with convenience methods for sending dopamine to any layer that implements the `DALayer` interface.
- The RW and TD DA layers use the `CyclePost` layer-level method to send DA to other layers at the end of each cycle, after activation is updated. DA therefore lags by one cycle, which typically should not be a problem.
- See the separate `pvlv` package for the full biologically based PVLV model built on top of this basic DA infrastructure.
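The one-cycle lag described above can be illustrated with a small standalone sketch. It does not use the axon API: `daSource`, `daTarget`, and `runCycles` are hypothetical names, and the loop only imitates the ordering of "update activation, then send DA at the end of the cycle".

```go
package main

import "fmt"

// daSource is a hypothetical stand-in for a DA-computing layer.
type daSource struct {
	computed float32 // DA computed from this cycle's activations
}

// daTarget is a hypothetical stand-in for a layer that receives DA.
type daTarget struct {
	da float32 // DA value visible during the activation update
}

// runCycles simulates one cycle per entry in signal. Within a cycle the
// target first uses the DA it already holds, then the source computes a
// new value, and finally a CyclePost-style step sends that value on.
// It returns the DA value the target saw during each cycle.
func runCycles(signal []float32) []float32 {
	src := &daSource{}
	tgt := &daTarget{}
	seen := make([]float32, len(signal))
	for cyc, s := range signal {
		seen[cyc] = tgt.da    // activation update uses DA sent last cycle
		src.computed = s      // source computes this cycle's DA
		tgt.da = src.computed // end-of-cycle send, visible next cycle
	}
	return seen
}

func main() {
	fmt.Println(runCycles([]float32{1, 0.5, 0}))
}
```

In this sketch the target sees each source value one cycle after it was computed, which is the lag the overview refers to.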
Index ¶
- Variables
- func AddRWLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
- func AddRWLayersPy(nt *axon.Network, prefix string, rel relpos.Relations, space float32) []axon.AxonLayer
- func AddTDLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
- func AddTDLayersPy(nt *axon.Network, prefix string, rel relpos.Relations, space float32) []axon.AxonLayer
- func SetNeuronExtPosNeg(nrn *axon.Neuron, ni int, val float32)
- type AChLayer
- type ClampAChLayer
- type ClampDaLayer
- type DALayer
- type Layer
- type Network
- func (nt *Network) AddClampDaLayer(name string) *ClampDaLayer
- func (nt *Network) AddRWLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
- func (nt *Network) AddTDLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
- func (nt *Network) UnitVarNames() []string
- type RWDaLayer
- type RWPredLayer
- type RWPrjn
- type RewLayer
- type SendACh
- type SendDA
- type TDDaLayer
- func (ly *TDDaLayer) ActFmG(ltime *axon.Time)
- func (ly *TDDaLayer) Build() error
- func (ly *TDDaLayer) CyclePost(ltime *axon.Time)
- func (ly *TDDaLayer) Defaults()
- func (ly *TDDaLayer) GFmInc(ltime *axon.Time)
- func (ly *TDDaLayer) RewIntegDA(ltime *axon.Time) float32
- func (ly *TDDaLayer) RewIntegLayer() (*TDRewIntegLayer, error)
- type TDRewIntegLayer
- func (ly *TDRewIntegLayer) ActFmG(ltime *axon.Time)
- func (ly *TDRewIntegLayer) Build() error
- func (ly *TDRewIntegLayer) Defaults()
- func (ly *TDRewIntegLayer) GFmInc(ltime *axon.Time)
- func (ly *TDRewIntegLayer) RewLayer() (*RewLayer, error)
- func (ly *TDRewIntegLayer) RewPredAct(ltime *axon.Time) float32
- func (ly *TDRewIntegLayer) RewPredLayer() (*TDRewPredLayer, error)
- type TDRewIntegParams
- type TDRewPredLayer
- type TDRewPredPrjn
Constants ¶
This section is empty.
Variables ¶
var (
	// NeuronVars are extra neuron variables for pcore
	NeuronVars = []string{"DA"}

	// NeuronVarsAll is the pcore collection of all neuron-level vars
	NeuronVarsAll []string
)
var KiT_ClampAChLayer = kit.Types.AddType(&ClampAChLayer{}, axon.LayerProps)
var KiT_ClampDaLayer = kit.Types.AddType(&ClampDaLayer{}, axon.LayerProps)
var KiT_Layer = kit.Types.AddType(&Layer{}, axon.LayerProps)
var KiT_Network = kit.Types.AddType(&Network{}, NetworkProps)
var KiT_RWDaLayer = kit.Types.AddType(&RWDaLayer{}, deep.LayerProps)
var KiT_RWPredLayer = kit.Types.AddType(&RWPredLayer{}, axon.LayerProps)
var KiT_RewLayer = kit.Types.AddType(&RewLayer{}, axon.LayerProps)
var KiT_TDDaLayer = kit.Types.AddType(&TDDaLayer{}, axon.LayerProps)
var KiT_TDRewIntegLayer = kit.Types.AddType(&TDRewIntegLayer{}, axon.LayerProps)
var KiT_TDRewPredLayer = kit.Types.AddType(&TDRewPredLayer{}, axon.LayerProps)
var KiT_TDRewPredPrjn = kit.Types.AddType(&TDRewPredPrjn{}, axon.PrjnProps)
var NetworkProps = axon.NetworkProps
Functions ¶
func AddRWLayers ¶
func AddRWLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
AddRWLayers adds a simple Rescorla-Wagner (PV only) dopamine system, with a primary Reward layer, a RWPred prediction layer, and a dopamine layer that computes the difference between the two. DA is generated only when the Rew layer has external input -- otherwise it is zero.
func AddRWLayersPy ¶
func AddRWLayersPy(nt *axon.Network, prefix string, rel relpos.Relations, space float32) []axon.AxonLayer
AddRWLayersPy adds a simple Rescorla-Wagner (PV only) dopamine system, with a primary Reward layer, a RWPred prediction layer, and a dopamine layer that computes the difference between the two. DA is generated only when the Rew layer has external input -- otherwise it is zero. The Py suffix marks the Python version, which returns the layers as a slice.
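The relationship between a multi-return builder and its Py variant can be sketched as follows. This is an illustration only: `layer`, `addThree`, and `addThreePy` are hypothetical names, and the layer names built from the prefix are assumptions, not the names the package actually assigns.

```go
package main

import "fmt"

// layer is a hypothetical stand-in for axon.AxonLayer.
type layer struct{ name string }

// addThree is a hypothetical builder with three named return values,
// in the same style as AddRWLayers.
func addThree(prefix string) (rew, rp, da *layer) {
	rew = &layer{name: prefix + "Rew"}
	rp = &layer{name: prefix + "RWPred"}
	da = &layer{name: prefix + "DA"}
	return
}

// addThreePy wraps addThree and returns the same layers, in the same
// order, as a single slice, which is easier to consume from bindings
// that handle one return value.
func addThreePy(prefix string) []*layer {
	rew, rp, da := addThree(prefix)
	return []*layer{rew, rp, da}
}

func main() {
	for _, ly := range addThreePy("X") {
		fmt.Println(ly.name)
	}
}
```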
func AddTDLayers ¶
func AddTDLayers(nt *axon.Network, prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
AddTDLayers adds the standard TD temporal differences layers, generating a DA signal. The projection from Rew to RewInteg is given class TDRewToInteg -- it should have no learning and a weight of 1.
func AddTDLayersPy ¶
func AddTDLayersPy(nt *axon.Network, prefix string, rel relpos.Relations, space float32) []axon.AxonLayer
AddTDLayersPy adds the standard TD temporal differences layers, generating a DA signal. The projection from Rew to RewInteg is given class TDRewToInteg -- it should have no learning and a weight of 1. The Py suffix marks the Python version, which returns the layers as a slice.
Types ¶
type AChLayer ¶
type AChLayer interface {
	// GetACh returns the acetylcholine level for layer
	GetACh() float32

	// SetACh sets the acetylcholine level for layer
	SetACh(ach float32)
}
AChLayer is an interface for a layer with acetylcholine neuromodulator on it
type ClampAChLayer ¶
type ClampAChLayer struct {
	axon.Layer
	SendACh SendACh `desc:"list of layers to send acetylcholine to"`
	ACh     float32 `desc:"acetylcholine value for this layer"`
}
ClampAChLayer is an Input layer that just sends its activity as the acetylcholine signal
func (*ClampAChLayer) Build ¶
func (ly *ClampAChLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*ClampAChLayer) CyclePost ¶
func (ly *ClampAChLayer) CyclePost(ltime *axon.Time)
CyclePost is called at the end of each Cycle. We use it to send ACh, which will then be active for the next cycle of processing.
func (*ClampAChLayer) GetACh ¶
func (ly *ClampAChLayer) GetACh() float32
func (*ClampAChLayer) SetACh ¶
func (ly *ClampAChLayer) SetACh(ach float32)
type ClampDaLayer ¶
type ClampDaLayer struct {
	axon.Layer
	SendDA SendDA  `desc:"list of layers to send dopamine to"`
	DA     float32 `desc:"dopamine value for this layer"`
}
ClampDaLayer is an Input layer that just sends its activity as the dopamine signal
func AddClampDaLayer ¶
func AddClampDaLayer(nt *axon.Network, name string) *ClampDaLayer
AddClampDaLayer adds a ClampDaLayer of given name
func (*ClampDaLayer) Build ¶
func (ly *ClampDaLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*ClampDaLayer) CyclePost ¶
func (ly *ClampDaLayer) CyclePost(ltime *axon.Time)
CyclePost is called at the end of each Cycle. We use it to send DA, which will then be active for the next cycle of processing.
func (*ClampDaLayer) Defaults ¶
func (ly *ClampDaLayer) Defaults()
func (*ClampDaLayer) GetDA ¶
func (ly *ClampDaLayer) GetDA() float32
func (*ClampDaLayer) SetDA ¶
func (ly *ClampDaLayer) SetDA(da float32)
type DALayer ¶
type DALayer interface {
	// GetDA returns the dopamine level for layer
	GetDA() float32

	// SetDA sets the dopamine level for layer
	SetDA(da float32)
}
DALayer is an interface for a layer with dopamine neuromodulator on it
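A minimal sketch of how a name list can deliver dopamine to any layer that satisfies this interface is given below. It is not the package's `SendDA` implementation: `simpleLayer`, `sendList`, and the map-based layer lookup are hypothetical, chosen so the example runs without the axon dependency.

```go
package main

import "fmt"

// DALayer mirrors the interface documented above.
type DALayer interface {
	GetDA() float32
	SetDA(da float32)
}

// simpleLayer is a hypothetical layer that stores a single DA value.
type simpleLayer struct{ da float32 }

func (l *simpleLayer) GetDA() float32   { return l.da }
func (l *simpleLayer) SetDA(da float32) { l.da = da }

// sendList is a hypothetical analogue of SendDA: a list of layer names.
type sendList []string

// Add appends any number of layer names to the list.
func (sl *sendList) Add(names ...string) { *sl = append(*sl, names...) }

// Send sets da on every named entry that implements DALayer and returns
// how many layers were updated. Missing names and entries of other types
// are skipped.
func (sl sendList) Send(layers map[string]interface{}, da float32) int {
	n := 0
	for _, nm := range sl {
		if dl, ok := layers[nm].(DALayer); ok {
			dl.SetDA(da)
			n++
		}
	}
	return n
}

func main() {
	a, b := &simpleLayer{}, &simpleLayer{}
	layers := map[string]interface{}{"A": a, "B": b}
	var sl sendList
	sl.Add("A", "B")
	fmt.Println(sl.Send(layers, 0.5), a.GetDA(), b.GetDA())
}
```

The type assertion is the key step: the sender needs no knowledge of the concrete layer type, only that it has `GetDA` and `SetDA`.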
type Layer ¶ added in v1.4.14
Layer is the base layer type for RL framework. Adds a dopamine variable to base Axon layer type.
func (*Layer) UnitVal1D ¶ added in v1.4.14
UnitVal1D returns the value of the given variable index on the given unit, using a 1-dimensional index. Returns NaN on invalid index. This is the core unit var access method used by other methods, so it is the only one that needs to be updated for derived layer types.
func (*Layer) UnitVarIdx ¶ added in v1.4.14
UnitVarIdx returns the index of given variable within the Neuron, according to UnitVarNames() list (using a map to lookup index), or -1 and error message if not found.
func (*Layer) UnitVarNum ¶ added in v1.4.14
UnitVarNum returns the number of Neuron-level variables for this layer. This is needed for extending indexes in derived types.
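The three accessors above follow a common pattern for extending a base layer with extra variables. The sketch below shows that pattern in isolation, under stated assumptions: the variable names `Act` and `Ge`, the `layer` type, and the lower-case method names are all hypothetical and stand in for the real axon types.

```go
package main

import (
	"fmt"
	"math"
)

// Hypothetical variable names: two base variables plus the extra "DA".
var (
	baseVars  = []string{"Act", "Ge"}
	extraVars = []string{"DA"}
	allVars   = append(append([]string{}, baseVars...), extraVars...)
	varIndex  = buildIndex(allVars)
)

// buildIndex maps each variable name to its position in names.
func buildIndex(names []string) map[string]int {
	m := make(map[string]int, len(names))
	for i, nm := range names {
		m[nm] = i
	}
	return m
}

// layer is a hypothetical layer with per-unit base variables and one
// layer-wide DA value.
type layer struct {
	act, ge []float32
	da      float32
}

// unitVarNum returns the number of unit-level variables.
func (ly *layer) unitVarNum() int { return len(allVars) }

// unitVarIdx looks the name up in a map and returns -1 and an error if
// it is not found.
func (ly *layer) unitVarIdx(name string) (int, error) {
	idx, ok := varIndex[name]
	if !ok {
		return -1, fmt.Errorf("variable %q not found", name)
	}
	return idx, nil
}

// unitVal1D returns the value of variable varIdx on unit idx, or NaN if
// either index is out of range.
func (ly *layer) unitVal1D(varIdx, idx int) float32 {
	nan := float32(math.NaN())
	if idx < 0 || idx >= len(ly.act) {
		return nan
	}
	switch varIdx {
	case 0:
		return ly.act[idx]
	case 1:
		return ly.ge[idx]
	case 2:
		return ly.da // same DA value for every unit in the layer
	}
	return nan
}

func main() {
	ly := &layer{act: []float32{0.25, 0.5}, ge: []float32{1, 2}, da: 0.75}
	idx, _ := ly.unitVarIdx("DA")
	fmt.Println(ly.unitVarNum(), idx, ly.unitVal1D(idx, 1))
}
```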
type Network ¶ added in v1.4.14
rl.Network enables display of the Da variable for pure rl models
func (*Network) AddClampDaLayer ¶ added in v1.4.14
func (nt *Network) AddClampDaLayer(name string) *ClampDaLayer
AddClampDaLayer adds a ClampDaLayer of given name
func (*Network) AddRWLayers ¶ added in v1.4.14
func (nt *Network) AddRWLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, da axon.AxonLayer)
AddRWLayers adds a simple Rescorla-Wagner (PV only) dopamine system, with a primary Reward layer, a RWPred prediction layer, and a dopamine layer that computes the difference between the two. DA is generated only when the Rew layer has external input -- otherwise it is zero.
func (*Network) AddTDLayers ¶ added in v1.4.14
func (nt *Network) AddTDLayers(prefix string, rel relpos.Relations, space float32) (rew, rp, ri, td axon.AxonLayer)
AddTDLayers adds the standard TD temporal differences layers, generating a DA signal. The projection from Rew to RewInteg is given class TDRewToInteg -- it should have no learning and a weight of 1.
func (*Network) UnitVarNames ¶ added in v1.4.14
UnitVarNames returns a list of variable names available on the units in this layer
type RWDaLayer ¶
type RWDaLayer struct {
	Layer
	SendDA    SendDA `desc:"list of layers to send dopamine to"`
	RewLay    string `desc:"name of Reward-representing layer from which this computes DA -- if nothing clamped, no dopamine computed"`
	RWPredLay string `desc:"name of RWPredLayer layer that is subtracted from the reward value"`
}
RWDaLayer computes a dopamine (DA) signal based on a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework). It computes the difference between r(t) and the RWPred values. r(t) is accessed directly from a Rew layer -- if there is no external input then no DA is computed -- which is critical for effective use of RW only for PV cases. The RWPred prediction is also accessed directly, from the RWPred layer, to avoid any issues.
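The computation described for RWDaLayer reduces to a gated subtraction. The following is a minimal sketch of that rule; `rwDA` and its parameters are hypothetical names, and the boolean flag stands in for "the Rew layer has external input".

```go
package main

import "fmt"

// rwDA returns the reward prediction error r(t) - pred when a reward
// value is externally present, and 0 otherwise.
func rwDA(rew float32, hasExt bool, pred float32) float32 {
	if !hasExt {
		return 0 // nothing clamped on the reward layer: no DA
	}
	return rew - pred
}

func main() {
	fmt.Println(rwDA(1, true, 0.25))  // reward present
	fmt.Println(rwDA(0, false, 0.25)) // reward absent
}
```

Gating on the presence of external input keeps a nonzero prediction from producing negative DA on time steps where no reward information is available at all.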
func (*RWDaLayer) Build ¶
Build constructs the layer state, including calling Build on the projections.
type RWPredLayer ¶
type RWPredLayer struct {
	Layer
	PredRange minmax.F32 `` /* 180-byte string literal not displayed */
}
RWPredLayer computes reward prediction for a simple Rescorla-Wagner learning dynamic (i.e., PV learning in the PVLV framework). Activity is computed as linear function of excitatory conductance (which can be negative -- there are no constraints). Use with RWPrjn which does simple delta-rule learning on minus-plus.
func (*RWPredLayer) ActFmG ¶
func (ly *RWPredLayer) ActFmG(ltime *axon.Time)
func (*RWPredLayer) Defaults ¶
func (ly *RWPredLayer) Defaults()
type RWPrjn ¶
type RWPrjn struct {
	axon.Prjn
	DaTol        float32 `` /* 208-byte string literal not displayed */
	OppSignLRate float32 `desc:"how much to learn on opposite DA sign coding neuron (0..1)"`
}
RWPrjn does dopamine-modulated learning for reward prediction: Da * Send.Act. Use in RWPredLayer typically to generate reward predictions. Has no weight bounds or limits on sign etc.
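As an illustration of the Da * Send.Act rule, the sketch below trains a single unbounded weight over repeated trials. It is a simplified model, not the projection's actual code: `rwTrain` is a hypothetical name, the learning rate is an explicit parameter, and `DaTol` and `OppSignLRate` are left out.

```go
package main

import "fmt"

// rwTrain runs the delta rule for the given number of trials with a
// constant reward and sender activity, and returns the final weight.
// The weight is unbounded and may take either sign.
func rwTrain(trials int, lrate, rew, sendAct float32) float32 {
	var wt float32
	for i := 0; i < trials; i++ {
		pred := wt * sendAct       // linear reward prediction
		da := rew - pred           // prediction error acts as DA
		wt += lrate * da * sendAct // dopamine-modulated update
	}
	return wt
}

func main() {
	for n := 0; n <= 3; n++ {
		fmt.Println(n, rwTrain(n, 0.5, 1, 1))
	}
}
```

With a learning rate of 0.5 the error halves on every trial, so the prediction approaches the reward value and DA shrinks toward zero.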
type RewLayer ¶ added in v1.4.14
type RewLayer struct {
Layer
}
RewLayer represents positive or negative reward values across 2 units, showing spiking rates for each, and Act always represents signed value.
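One way to picture the two-unit coding is shown below. This is a sketch of a plausible scheme rather than the body of `SetNeuronExtPosNeg`: the function names are hypothetical, and the assumption is that unit 0 carries the positive part of the value and unit 1 the magnitude of the negative part.

```go
package main

import "fmt"

// extPosNeg returns the external input for unit ni (0 or 1) given a
// signed reward value: unit 0 receives the positive part, unit 1 the
// magnitude of the negative part.
func extPosNeg(ni int, val float32) float32 {
	if ni == 0 {
		if val >= 0 {
			return val
		}
		return 0
	}
	if val >= 0 {
		return 0
	}
	return -val
}

// signedValue recovers the signed value from the two unit inputs.
func signedValue(pos, neg float32) float32 { return pos - neg }

func main() {
	for _, v := range []float32{0.5, -0.25, 0} {
		pos, neg := extPosNeg(0, v), extPosNeg(1, v)
		fmt.Println(v, pos, neg, signedValue(pos, neg))
	}
}
```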
type SendACh ¶
SendACh is a list of layers to send acetylcholine to
func (*SendACh) AddOne ¶
AddOne adds one layer name to the list -- this is the Python version, which doesn't support varargs.
type SendDA ¶
SendDA is a list of layers to send dopamine to
func (*SendDA) AddOne ¶
AddOne adds one layer name to the list -- this is the Python version, which doesn't support varargs.
type TDDaLayer ¶
type TDDaLayer struct {
	Layer
	SendDA   SendDA `desc:"list of layers to send dopamine to"`
	RewInteg string `desc:"name of TDRewIntegLayer from which this computes the temporal derivative"`
}
TDDaLayer computes a dopamine (DA) signal as the temporal difference (TD) between the TDRewIntegLayer activations in the minus and plus phase.
func (*TDDaLayer) Build ¶
Build constructs the layer state, including calling Build on the projections.
func (*TDDaLayer) CyclePost ¶
CyclePost is called at the end of each Cycle. We use it to send DA, which will then be active for the next cycle of processing.
func (*TDDaLayer) RewIntegDA ¶ added in v1.4.14
func (*TDDaLayer) RewIntegLayer ¶
func (ly *TDDaLayer) RewIntegLayer() (*TDRewIntegLayer, error)
type TDRewIntegLayer ¶
type TDRewIntegLayer struct {
	Layer
	RewInteg TDRewIntegParams `desc:"parameters for reward integration"`
}
TDRewIntegLayer is the temporal differences reward integration layer. It represents estimated value V(t) in the minus phase, and estimated V(t+1) + r(t) in the plus phase. It directly accesses r(t) from the Rew layer, and V(t) from the RewPred layer.
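The minus-phase and plus-phase quantities, and the TD signal formed from their difference, can be sketched as follows. The names `integParams`, `minusAct`, `plusAct`, and `tdDA` are hypothetical; the way `Discount` and `RewPredGain` enter the expressions is an assumption based on their field descriptions, and the parameter values are chosen only to keep the arithmetic exact.

```go
package main

import "fmt"

// integParams is a hypothetical analogue of the numeric fields of
// TDRewIntegParams.
type integParams struct {
	Discount    float32 // discount applied to the future prediction
	RewPredGain float32 // gain applied to prediction activations
}

// minusAct is the minus-phase value: the gain-scaled estimate V(t).
func (p integParams) minusAct(vNow float32) float32 {
	return p.RewPredGain * vNow
}

// plusAct is the plus-phase value: r(t) plus the discounted,
// gain-scaled estimate V(t+1).
func (p integParams) plusAct(rew, vNext float32) float32 {
	return rew + p.Discount*p.RewPredGain*vNext
}

// tdDA is the temporal difference: plus-phase minus minus-phase value.
func tdDA(minus, plus float32) float32 { return plus - minus }

func main() {
	p := integParams{Discount: 0.5, RewPredGain: 1} // illustrative values
	minus := p.minusAct(0.5)  // V(t) = 0.5
	plus := p.plusAct(1, 0.5) // r(t) = 1, V(t+1) = 0.5
	fmt.Println(minus, plus, tdDA(minus, plus))
}
```

When the predictions are accurate, the plus-phase value matches the minus-phase value and the difference is zero, so DA is produced only for unexpected rewards or changes in expected value.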
func (*TDRewIntegLayer) ActFmG ¶
func (ly *TDRewIntegLayer) ActFmG(ltime *axon.Time)
func (*TDRewIntegLayer) Build ¶
func (ly *TDRewIntegLayer) Build() error
Build constructs the layer state, including calling Build on the projections.
func (*TDRewIntegLayer) Defaults ¶
func (ly *TDRewIntegLayer) Defaults()
func (*TDRewIntegLayer) GFmInc ¶ added in v1.4.14
func (ly *TDRewIntegLayer) GFmInc(ltime *axon.Time)
func (*TDRewIntegLayer) RewLayer ¶ added in v1.4.14
func (ly *TDRewIntegLayer) RewLayer() (*RewLayer, error)
func (*TDRewIntegLayer) RewPredAct ¶ added in v1.4.14
func (ly *TDRewIntegLayer) RewPredAct(ltime *axon.Time) float32
func (*TDRewIntegLayer) RewPredLayer ¶
func (ly *TDRewIntegLayer) RewPredLayer() (*TDRewPredLayer, error)
type TDRewIntegParams ¶
type TDRewIntegParams struct {
	Discount    float32 `desc:"discount factor -- how much to discount the future prediction from RewPred"`
	RewPredGain float32 `desc:"gain factor on rew pred activations"`
	RewPred     string  `desc:"name of TDRewPredLayer to get reward prediction from "`
	Rew         string  `desc:"name of RewLayer to get current reward from "`
}
TDRewIntegParams are params for reward integrator layer
func (*TDRewIntegParams) Defaults ¶
func (tp *TDRewIntegParams) Defaults()
type TDRewPredLayer ¶
type TDRewPredLayer struct {
Layer
}
TDRewPredLayer is the temporal differences reward prediction layer. It represents estimated value V(t) in the minus phase, and computes estimated V(t+1) based on its learned weights in plus phase. Use TDRewPredPrjn for DA modulated learning.
func (*TDRewPredLayer) ActFmG ¶
func (ly *TDRewPredLayer) ActFmG(ltime *axon.Time)
func (*TDRewPredLayer) Defaults ¶ added in v1.4.14
func (ly *TDRewPredLayer) Defaults()
type TDRewPredPrjn ¶
type TDRewPredPrjn struct {
	axon.Prjn
	OppSignLRate float32 `desc:"how much to learn on opposite DA sign coding neuron (0..1)"`
}
TDRewPredPrjn does dopamine-modulated learning for reward prediction: DWt = Da * Send.ActPrv (activity on the *previous* timestep). Use in TDRewPredLayer typically to generate reward predictions. If the Da sign is positive, the first recv unit learns fully; if it is negative, the second one learns fully. A lower lrate applies in the opposite cases. Weights are positive-only.
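The sign-dependent learning rule described above can be sketched as below. This is an interpretation under stated assumptions, not the projection's `DWt` code: `tdDWt` and `applyDWt` are hypothetical names, the learning rate is passed explicitly, and the sign flip for the second (negative-coding) unit is an assumption that keeps its positive-only weight growing when DA is negative.

```go
package main

import "fmt"

// tdDWt returns the weight change for receiving unit ri (0 codes
// positive values, 1 codes negative values), given the DA value and the
// sender's activity on the previous time step.
func tdDWt(da, sendActPrv float32, ri int, lrate, oppSignLRate float32) float32 {
	eff := lrate
	if ri == 0 {
		if da < 0 {
			eff *= oppSignLRate // opposite-sign case: reduced learning
		}
	} else {
		eff = -eff // assumption: negative DA strengthens this unit
		if da >= 0 {
			eff *= oppSignLRate // opposite-sign case: reduced learning
		}
	}
	return eff * da * sendActPrv
}

// applyDWt adds dwt to wt and keeps the result non-negative.
func applyDWt(wt, dwt float32) float32 {
	wt += dwt
	if wt < 0 {
		wt = 0
	}
	return wt
}

func main() {
	for _, da := range []float32{1, -1} {
		for ri := 0; ri < 2; ri++ {
			fmt.Println(da, ri, tdDWt(da, 1, ri, 0.5, 0.25))
		}
	}
}
```

In this sketch the unit whose coding matches the DA sign changes at the full rate, and the other unit changes in the opposite direction at the rate scaled by `OppSignLRate`.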
func (*TDRewPredPrjn) DWt ¶
func (pj *TDRewPredPrjn) DWt(ltime *axon.Time)
DWt computes the weight change (learning) -- on sending projections.
func (*TDRewPredPrjn) Defaults ¶
func (pj *TDRewPredPrjn) Defaults()
func (*TDRewPredPrjn) WtFmDWt ¶
func (pj *TDRewPredPrjn) WtFmDWt(ltime *axon.Time)
WtFmDWt updates the synaptic weight values from delta-weight changes -- on sending projections.