Class RealtimeTruncationRetentionRatio
public final class RealtimeTruncationRetentionRatio
Retain a fraction of the conversation tokens when the conversation exceeds the input token limit. This allows you to amortize truncations across multiple turns, which can help improve cached token usage.
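To make the amortization point concrete, the following sketch counts how often a truncation would be triggered over a series of turns. It is a simplified model and not the server's implementation: it assumes every turn adds the same number of tokens and that whole turns are dropped oldest-first. The class name TruncationAmortizationSketch and the method countTruncations are hypothetical and are not part of the SDK.

```java
public class TruncationAmortizationSketch {

    /**
     * Counts truncation events over a number of turns.
     * Assumption: each turn adds exactly tokensPerTurn tokens, and a truncation
     * drops whole turns until the total is at or below retentionRatio * maxTokens.
     */
    public static int countTruncations(int turns, int tokensPerTurn, int maxTokens, double retentionRatio) {
        if (tokensPerTurn <= 0) {
            throw new IllegalArgumentException("tokensPerTurn must be positive");
        }
        if (retentionRatio < 0.0 || retentionRatio > 1.0) {
            throw new IllegalArgumentException("retentionRatio must be in [0.0, 1.0]");
        }
        int total = 0;
        int truncations = 0;
        for (int i = 0; i < turns; i++) {
            total += tokensPerTurn;
            if (total > maxTokens) {
                truncations++;
                int target = (int) Math.floor(retentionRatio * maxTokens);
                // Drop the oldest turns until the conversation fits the target.
                while (total > target && total > 0) {
                    total -= tokensPerTurn;
                }
            }
        }
        return truncations;
    }

    public static void main(String[] args) {
        // Ratio 1.0: once the limit is reached, every later turn truncates again.
        System.out.println(countTruncations(50, 100, 1000, 1.0)); // prints 40
        // Ratio 0.5: each truncation frees headroom for several turns.
        System.out.println(countTruncations(50, 100, 1000, 0.5)); // prints 7
    }
}
```

In this model a lower ratio trades retained context for fewer truncation events. Each truncation changes the start of the conversation, so fewer events should leave a stable prefix in place for longer, which is the likely reason cached token usage improves.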
-
-
Nested Class Summary
Nested Classes
public final class RealtimeTruncationRetentionRatio.Builder
A builder for RealtimeTruncationRetentionRatio.
public final class RealtimeTruncationRetentionRatio.TokenLimits
Optional custom token limits for this truncation strategy. If not provided, the model's default token limits will be used.
-
Method Summary
final Double retentionRatio()
Fraction of post-instruction conversation tokens to retain (0.0-1.0) when the conversation exceeds the input token limit.
final JsonValue _type()
Use retention ratio truncation.
final Optional<RealtimeTruncationRetentionRatio.TokenLimits> tokenLimits()
Optional custom token limits for this truncation strategy.
final JsonField<Double> _retentionRatio()
Returns the raw JSON value of retentionRatio.
final JsonField<RealtimeTruncationRetentionRatio.TokenLimits> _tokenLimits()
Returns the raw JSON value of tokenLimits.
final Map<String, JsonValue> _additionalProperties()
final RealtimeTruncationRetentionRatio.Builder toBuilder()
final RealtimeTruncationRetentionRatio validate()
final Boolean isValid()
Boolean equals(Object other)
Integer hashCode()
String toString()
final static RealtimeTruncationRetentionRatio.Builder builder()
Returns a mutable builder for constructing an instance of RealtimeTruncationRetentionRatio.
-
Method Detail
-
retentionRatio
final Double retentionRatio()
Fraction of post-instruction conversation tokens to retain (0.0-1.0) when the conversation exceeds the input token limit. Setting this to 0.8 means that messages will be dropped until 80% of the maximum allowed tokens are used. This helps reduce the frequency of truncations and improve cache rates.
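The arithmetic behind a single truncation can be sketched as follows. This is an illustration under stated assumptions, not the server's algorithm: it assumes the oldest messages are dropped first, that messages are dropped whole, and that per-message token counts are known. The class RetentionRatioSketch and the method truncate are hypothetical names and are not part of the SDK.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class RetentionRatioSketch {

    /**
     * Returns the token counts of the messages that remain after truncation.
     * Nothing is dropped unless the total exceeds maxTokens. When it does, the
     * oldest messages are removed until the total is at most
     * retentionRatio * maxTokens (assumed oldest-first, whole messages only).
     */
    public static List<Integer> truncate(List<Integer> messageTokens, int maxTokens, double retentionRatio) {
        if (retentionRatio < 0.0 || retentionRatio > 1.0) {
            throw new IllegalArgumentException("retentionRatio must be in [0.0, 1.0]");
        }
        int total = 0;
        for (int tokens : messageTokens) {
            total += tokens;
        }
        if (total <= maxTokens) {
            return List.copyOf(messageTokens); // under the limit, keep everything
        }
        double target = retentionRatio * maxTokens;
        Deque<Integer> kept = new ArrayDeque<>(messageTokens);
        while (!kept.isEmpty() && total > target) {
            total -= kept.removeFirst(); // drop the oldest message
        }
        return List.copyOf(kept);
    }

    public static void main(String[] args) {
        // 1150 tokens against a limit of 1000 with ratio 0.8: the target is 800 tokens.
        List<Integer> conversation = List.of(500, 300, 250, 100);
        System.out.println(truncate(conversation, 1000, 0.8)); // prints [300, 250, 100]
    }
}
```

Because messages are dropped whole in this sketch, the retained total (650 tokens here) can land below the 800-token target rather than exactly on it.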
-
_type
final JsonValue _type()
Use retention ratio truncation.
Expected to always return the following:
JsonValue.from("retention_ratio")
However, this method can be useful for debugging and logging (e.g. if the server responded with an unexpected value).
-
tokenLimits
final Optional<RealtimeTruncationRetentionRatio.TokenLimits> tokenLimits()
Optional custom token limits for this truncation strategy. If not provided, the model's default token limits will be used.
-
_retentionRatio
final JsonField<Double> _retentionRatio()
Returns the raw JSON value of retentionRatio.
Unlike retentionRatio, this method doesn't throw if the JSON field has an unexpected type.
-
_tokenLimits
final JsonField<RealtimeTruncationRetentionRatio.TokenLimits> _tokenLimits()
Returns the raw JSON value of tokenLimits.
Unlike tokenLimits, this method doesn't throw if the JSON field has an unexpected type.
-
_additionalProperties
final Map<String, JsonValue> _additionalProperties()
-
toBuilder
final RealtimeTruncationRetentionRatio.Builder toBuilder()
-
validate
final RealtimeTruncationRetentionRatio validate()
-
builder
final static RealtimeTruncationRetentionRatio.Builder builder()
Returns a mutable builder for constructing an instance of RealtimeTruncationRetentionRatio.
The following fields are required:
.retentionRatio()
-