vectors¶
- class dialz.SteeringVector(model_type, directions)[source]¶
A per-layer steering direction for activation-level model control.
Steering vectors can be combined arithmetically (+, -, *, /), serialized to GGUF files, and applied to a SteeringModel via SteeringModel.set_control().
- model_type¶
HuggingFace model type string (e.g. "mistral").
- directions¶
Mapping from layer index to direction vector.
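The arithmetic described above can be pictured as element-wise operations on each layer's direction. The following is a minimal sketch in plain Python, not the dialz implementation: it assumes the operators act layer by layer and that both operands cover the same layers, which this page does not confirm. The helper names combine and scale are hypothetical.

```python
from typing import Callable, Dict, List

# Layer index -> direction vector, mirroring the shape of `directions`.
Directions = Dict[int, List[float]]


def combine(a: Directions, b: Directions,
            op: Callable[[float, float], float]) -> Directions:
    """Apply op element-wise, layer by layer (assumes identical layer keys)."""
    if a.keys() != b.keys():
        raise ValueError("both operands must cover the same layers")
    return {layer: [op(x, y) for x, y in zip(a[layer], b[layer])]
            for layer in a}


def scale(a: Directions, factor: float) -> Directions:
    """Multiply every component of every layer's direction by factor."""
    return {layer: [x * factor for x in vec] for layer, vec in a.items()}


# Toy two-layer, two-dimensional directions (illustrative values only).
happy = {10: [1.0, 0.0], 11: [0.5, 0.5]}
formal = {10: [0.0, 2.0], 11: [1.0, -0.5]}

blended = combine(happy, formal, lambda x, y: x + y)  # analogue of happy + formal
halved = scale(blended, 0.5)                          # analogue of blended * 0.5
```

With the real class, the equivalent expressions would presumably be written directly with the operators, for example (happy + formal) * 0.5.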
- export_gguf(path)[source]¶
Export this steering vector to a GGUF file.
- Return type:
None
Note
The GGUF format is not yet supported by llama.cpp for steering vectors. This is a WIP serialisation target.
- Parameters:
path – File path to write the .gguf file to.
Example:
vector = SteeringVector.train(model, dataset)
vector.export_gguf("vector.gguf")
- classmethod import_gguf(path)[source]¶
Load a steering vector from a GGUF file.
- Return type:
SteeringVector
- Parameters:
path – Path to the .gguf file.
- Returns:
The deserialized steering vector.
- Raises:
ValueError – If required GGUF fields are missing or malformed.
- classmethod train(model, dataset, method=Method.PCA, **kwargs)[source]¶
Train a SteeringVector from a contrastive dataset.
A tokenizer is loaded automatically from model.model_name.
- Return type:
SteeringVector
- Parameters:
model – The model to train against (must have model_name and token attributes).
dataset – The contrastive dataset used for training.
method – The extraction strategy. Accepts a Method enum, a string ("pca", "mean_diff", etc.), or any custom SteeringStrategy callable. Defaults to Method.PCA.
**kwargs – Forwarded to read_representations(). Useful keys:
batch_size (int) – max batch size (default 32).
token_index (int) – token position index into non-padding tokens (default -1, last token).
- Returns:
The trained steering vector.
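This page does not describe how the "mean_diff" strategy is computed. One common reading is that a layer's direction is the difference between the mean activation over the positive examples and the mean over the negative examples. The sketch below illustrates that reading on toy data in plain Python. The reading is an assumption, and the function names are hypothetical rather than part of dialz.

```python
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a list of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mean_diff_direction(positive: Sequence[Sequence[float]],
                        negative: Sequence[Sequence[float]]) -> List[float]:
    """Assumed mean-difference rule: mean(positive) - mean(negative)."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


# Toy activations for one layer: two examples per side, two dimensions each.
positive_acts = [[2.0, 1.0], [4.0, 3.0]]
negative_acts = [[1.0, 1.0], [1.0, 3.0]]

direction = mean_diff_direction(positive_acts, negative_acts)
```

In this toy case the two sides differ only along the first dimension, so the resulting direction has a zero second component.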
- class dialz.SteeringModel(model_name, layer_ids, token=None, torch_dtype=torch.float16)[source]¶
This mutates the wrapped `model`! Be careful using `model` after passing it to this class.
A wrapped language model that can have controls set on its layers with self.set_control.
- property config¶
Model configuration (delegates to the wrapped model).
- property device¶
Device the model resides on.
- reset()[source]¶
Resets the control for all layer_ids, returning the model to base behavior.
- Return type:
None
- set_control(control, scalar=1.0, **kwargs)[source]¶
Apply a SteeringVector to the controllable layers.
- Return type:
None
- Parameters:
control – Steering vector whose layer directions will be applied.
scalar – Strength multiplier. Negative values invert the direction (e.g. happiness → sadness).
**kwargs – Passed to BlockControlParams. normalize (bool) rescales activations to their pre-control magnitude. operator (callable) overrides the default + combination.
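The interaction of scalar, normalize, and operator can be sketched for a single activation vector. This is an illustration under stated assumptions, not the library's code: it assumes the scaled direction is combined with the activation element-wise, and that normalize rescales the result to the original L2 norm. The function apply_control is hypothetical, and its combine argument stands in for the operator keyword.

```python
import math
import operator as op
from typing import Callable, List

Vector = List[float]


def norm(v: Vector) -> float:
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def apply_control(activation: Vector, direction: Vector, scalar: float = 1.0,
                  normalize: bool = False,
                  combine: Callable[[float, float], float] = op.add) -> Vector:
    """Combine an activation with a scaled direction (assumed semantics)."""
    steered = [combine(a, scalar * d) for a, d in zip(activation, direction)]
    if normalize:
        before, after = norm(activation), norm(steered)
        if after > 0:
            # Rescale so the steered vector keeps the pre-control magnitude.
            steered = [x * before / after for x in steered]
    return steered


activation = [3.0, 4.0]   # toy activation with norm 5
direction = [1.0, 0.0]    # toy steering direction

plain = apply_control(activation, direction, scalar=2.0)
inverted = apply_control(activation, direction, scalar=-2.0)
rescaled = apply_control(activation, direction, scalar=2.0, normalize=True)
```

A negative scalar pushes the activation the opposite way along the same direction, and normalize=True changes the vector's orientation while leaving its length unchanged.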
- set_raw_control(control, **kwargs)[source]¶
Set or remove control parameters on the layers this ControlModel handles. The keys of control should equal, or be a superset of, the layer_ids passed to __init__. Only those layers are controlled; any other keys in control are ignored.
Passing control=None will reset the control tensor for all layer_ids, making the model act like a non-control model.
- Return type:
None
Additional kwargs:
- normalize: bool: track the magnitude of the non-modified activation, and rescale the activation to that magnitude after control (default: False)
- operator: Callable[[Tensor, Tensor], Tensor]: how to combine the base output and control (default: +)
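The key-handling rules for control can be sketched with a small helper. This is a hypothetical illustration in plain Python, not the library's code. In particular, raising KeyError when a controlled layer is missing is an assumption; the page only says the keys should cover layer_ids.

```python
from typing import Dict, List, Optional

Vector = List[float]


def select_controls(control: Optional[Dict[int, Vector]],
                    layer_ids: List[int]) -> Dict[int, Optional[Vector]]:
    """Pick the per-layer control for each controlled layer.

    control=None clears every controlled layer. Keys outside layer_ids
    are ignored. A missing controlled layer raises KeyError (assumed).
    """
    if control is None:
        return {layer: None for layer in layer_ids}
    missing = [layer for layer in layer_ids if layer not in control]
    if missing:
        raise KeyError(f"control is missing layers: {missing}")
    return {layer: control[layer] for layer in layer_ids}


layer_ids = [10, 11]
raw = {9: [0.1], 10: [0.2], 11: [0.3]}  # layer 9 is not controlled

selected = select_controls(raw, layer_ids)   # entry for layer 9 is dropped
cleared = select_controls(None, layer_ids)   # every controlled layer reset
```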