Developer
Shay Rojansky
roji@roji.org
Performance
YoY: +9333%
Key patterns and highlights from this developer's activity.
Breakdown of growth, maintenance, and fixes effort over time.
Bugs introduced vs. fixed over time.
Reclassifies engineering effort based on bug attribution. Commits that introduced bugs are retrospectively counted as poor investments.
Investment Quality reclassifies engineering effort based on bug attribution data. For commits identified as buggy origins (commits that introduced a bug someone later fixed), grow and maintenance time is moved into the Wasted Time category, while any fix time in those commits still counts as productive. All other commits retain their standard classification: grow is productive, maintenance is maintenance, and waste (fixes) is productive.
The standard model classifies commits as Growth, Maintenance, or Fixes. Investment Quality adds a quality lens: a commit that introduced a bug is retrospectively counted as a poor investment, because the engineering time spent on it ultimately required additional fix work. Fix commits (Fixes in the standard model) are reframed as productive, because fixing bugs is valuable work.
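The reclassification rule described above can be sketched in a few lines. This is a minimal illustration, not the dashboard's actual implementation: the `CommitEffort` shape, the field names, and the function names are hypothetical, and effort is assumed to be available per commit as hours split across the three standard categories.

```python
from dataclasses import dataclass


@dataclass
class CommitEffort:
    """Effort for one commit, split by standard category (hypothetical shape)."""
    sha: str
    grow: float = 0.0
    maintenance: float = 0.0
    fixes: float = 0.0              # called "waste" in the standard model
    is_buggy_origin: bool = False   # bug attribution traced a later fix back to this commit


def investment_quality(commits):
    """Reclassify effort into productive / maintenance / wasted totals."""
    totals = {"productive": 0.0, "maintenance": 0.0, "wasted": 0.0}
    for c in commits:
        # Fix time always counts as productive, even in a buggy-origin commit.
        totals["productive"] += c.fixes
        if c.is_buggy_origin:
            # Grow and maintenance time of a buggy origin is moved to wasted.
            totals["wasted"] += c.grow + c.maintenance
        else:
            totals["productive"] += c.grow
            totals["maintenance"] += c.maintenance
    return totals


def to_percentages(totals):
    """Convert totals to whole-number percentages (all zeros if there is no effort)."""
    total = sum(totals.values())
    if total == 0:
        return {name: 0 for name in totals}
    return {name: round(100 * value / total) for name, value in totals.items()}


# Illustrative data only; the hashes and hours are made up.
sample = [
    CommitEffort("aaa1111", grow=4.0, maintenance=1.0),
    CommitEffort("bbb2222", grow=2.0, maintenance=1.0, fixes=0.5, is_buggy_origin=True),
    CommitEffort("ccc3333", fixes=1.5),
]
print(to_percentages(investment_quality(sample)))
```

Because each percentage is rounded independently, the three values may not always sum to exactly 100; a real implementation would need to decide how to distribute the rounding remainder.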
Currently computed client-side from commit and bug attribution data. Ideal server-side endpoint:
POST /v1/organizations/{orgId}/investment-quality
Content-Type: application/json
Request:
{
"startTime": "2025-01-01T00:00:00Z",
"endTime": "2025-12-31T23:59:59Z",
"bucketSize": "BUCKET_SIZE_MONTH",
"groupBy": ["repository_id" | "deliverer_email"]
}
Response:
{
"productivePct": 74,
"maintenancePct": 18,
"wastedPct": 8,
"buckets": [
{
"bucketStart": "2025-01-01T00:00:00Z",
"productive": 4.2,
"maintenance": 1.8,
"wasted": 0.6
}
]
}
Latest analyzed commits from this developer.
| Hash | Message | Date | Files | Effort |
|---|---|---|---|---|
| 134e52e | Use newest Npgsql when targeting net8.0 and above (#13724) | Mar 31 | 2 | – |
| 5f282a9 | This commit introduces significant **refactoring and architectural improvements** to the **`CollectionModel` and `CollectionModelBuilder` infrastructure** within the **`VectorData.Abstractions`** module. Key changes include replacing virtual property accessors with delegate fields for **performance optimization**, simplifying record creation by using `Func<object>` instead of `IRecordCreator`, and extracting vector embedding resolution logic to reduce duplication. Crucially, it introduces **read-only interfaces** (`IPropertyModel`, `IVectorPropertyModel`, etc.) to provide providers with an **immutable view of the collection model** post-build, which is a **breaking change for experimental provider APIs**. These changes enhance the clarity, immutability, and efficiency of the data model exposed to vector data providers, with minor updates to **`CosmosNoSql`** and **`MongoDB`** connectors. | Mar 26 | 10 | maint |
| 082e28e | This commit **refactors** the **.NET VectorData providers** by **removing all legacy filtering logic** associated with the `OldFilter` property. It standardizes the filtering mechanism across various providers like `AzureAISearch`, `CosmosMongoDB`, and `PgVector`, ensuring all search operations now exclusively use the modern `Filter` property. This **maintenance** effort cleans up the codebase, eliminating obsolete methods such as `BuildLegacyFilter` and `GenerateLegacyFilterWhereClause` from components like `CosmosNoSqlCollectionQueryBuilder` and `PgVector.PostgresSqlBuilder`. The `VectorData.Abstractions` module is also updated to remove the `OldFilter` property, and associated unit tests are adjusted. The change simplifies the API for vector search, improving consistency and reducing technical debt within the **`VectorData` subsystem**, while retaining `FilterClause` for `ITextSearch` compatibility. | Mar 21 | 41 | maint |
| b36a327 | This commit delivers **critical input validation and escaping fixes** across the **Cosmos NoSQL, Redis, and Weaviate vector data providers**. It refactors the `CosmosNoSqlCollectionQueryBuilder` to utilize parameterized queries for `FullTextScore` keywords, preventing injection risks. Additionally, the `RedisFilterTranslator` now includes a `SanitizeStringConstant` method for proper string literal escaping, and `WeaviateQueryBuilder` explicitly escapes backslashes and double quotes in hybrid search keywords. These **bug fixes** significantly enhance the **security and reliability** of query generation, ensuring correct handling of special characters and preventing potential injection vulnerabilities across these data stores. | Mar 4 | 7 | maint |
| e9641a9 | This commit **fixes a critical issue** in the **PostgreSQL, SQL Server, and SQLite vector data providers** where generated SQL `CREATE TABLE` statements failed to correctly apply column nullability constraints. It introduces an `IsNullable` property to the core `PropertyModel` and `SqliteColumn` to accurately track property nullability, including support for Nullable Reference Types (NRTs). The respective property mappers and command builders are updated to leverage this information, ensuring `NOT NULL` constraints are correctly generated in the database schema. This **enhances data integrity** by enforcing intended nullability and is thoroughly validated with new and updated unit and conformance tests across the affected providers. | Mar 4 | 12 | maint |
| 781881a | This commit introduces a **new `EmbeddingGenerationDispatcher` mechanism** to **deduplicate and centralize embedding generation management** across the entire **.NET `VectorData` component**. It involves a significant **refactoring** effort, updating all concrete vector database implementations (e.g., `AzureAISearchCollection`, `CosmosMongoCollection`, `PineconeCollection`, `RedisJsonCollection`, `SqlServerCollection`) to utilize this new dispatcher for methods like `GetVectorSearchValue` and `GetRecordsAndGeneratedEmbeddings`. This change streamlines how embeddings are resolved and generated, improving consistency and maintainability across various data stores. The core `VectorData.Abstractions` module is updated to introduce the dispatcher and modify `VectorPropertyModel` to delegate embedding generation, making the system more robust and extensible for future vector database integrations. | Mar 2 | 24 | maint |
| 74d4310 | This commit introduces a **new capability** to the **.NET vector data integration** with **PostgreSQL**, specifically within the `PgVector` component. It enhances the `CreateCollectionAsync` method to automatically check for and install the `pgvector` extension on the database. This **feature enhancement** simplifies the setup process by handling concurrent installation attempts and ensuring Npgsql types are reloaded correctly, significantly improving the reliability and ease of use for vector data operations. | Feb 27 | 2 | grow |
| f077655 | This commit introduces **robust validation for collection key types** across the **VectorData abstraction layer** and its numerous data provider implementations. It **adds a new capability** to `CollectionModelBuilder.cs` to store and validate the generic key type (`TKey`) against the actual key property type during model building, preventing potential mismatches. This **refactoring** ensures that all supported data providers, including **MongoDB, Azure AI Search, CosmosDB, InMemory, Postgres, Pinecone, Qdrant, Redis, SQL Server, SQLite, and Weaviate**, consistently leverage this validation, with a specific conditional skip for `CosmosNoSqlKey` composite keys. The change prevents runtime errors by throwing an `InvalidOperationException` when a collection is created with a mismatched generic key type, significantly improving type safety and developer experience. | Feb 26 | 38 | maint |
| f59e395 | This commit **enhances the .NET `PgVector` connector** by making the PostgreSQL database schema configuration **optional**. It **refactors** the internal schema handling within the **`PgVector` module**, specifically impacting `PostgresCollection`, `PostgresVectorStore`, and their `Options` classes to support nullable schema properties. Crucially, the **`PostgresSqlBuilder`** was updated to accept nullable schema parameters, defaulting to the 'public' schema for SQL generation when no schema is specified. This **refactoring** provides greater flexibility for users, simplifying configuration for common 'public' schema setups by allowing schema omission. The changes are thoroughly validated with **updated unit tests** for `PostgresSqlBuilder`. | Feb 25 | 6 | maint |
| ea864bf | This commit introduces **comprehensive support** for .NET's `DateTime`, `DateTimeOffset`, `DateOnly`, and `TimeOnly` types across several **vector data providers**. It involves updating data mapping, serialization, and filter translation logic within the **MongoDB**, **Azure AI Search**, **Cosmos NoSQL**, **PgVector**, **Qdrant**, **SqliteVec**, and **Weaviate** connectors. This **new feature** allows users to store, retrieve, and filter data using these temporal types, with specific handling for BSON, ISO 8601 string formatting, and dedicated JSON converters for Weaviate. Model builders and property type validation have been updated to recognize these types as valid data properties. A minor **bug fix** was also included to explicitly disallow `DateTimeOffset` and `DateOnly` in MongoDB vector search pre-filters, ensuring robust data handling across the system. | Feb 25 | 40 | grow |
| c086d09 | This commit introduces **approximate vector search support** for **SQL Server** within the .NET `VectorData` library, enabling the creation and utilization of **DiskANN vector indexes**. The **`SqlServerCommandBuilder`** is significantly updated to generate specific SQL commands for DiskANN index creation and approximate vector search queries using the `VECTOR_SEARCH` function. This **new capability** extends the `SqlServerCollection` to execute these commands and updates `SqlServerModelBuilder` to validate the new index kind. Comprehensive **conformance tests** have been added to validate this functionality, including a temporary workaround for SQL Server 2025 limitations, providing users with a powerful new tool for scalable vector data management. | Feb 23 | 6 | maint |
| 9f2dafc | This commit **removes the `SupportsMultipleKeys` model building option** from the `CollectionModelBuilder` within the **VectorData Abstractions** module. This is a **refactoring and cleanup** effort, as no existing data providers currently utilize this specific option for composite keys, even if they support composite keys in other ways. The change simplifies the `CollectionModelBuilder`'s constructor and `Validate` method by eliminating conditional logic related to `SupportsMultipleKeys`, reducing code complexity without impacting current functionality. | Feb 19 | 17 | maint |
| 8513c2a | This commit delivers a substantial **enhancement** and **refactoring** to the **Cosmos NoSQL provider**, primarily focusing on key management and read/write efficiency. It introduces the `CosmosNoSqlKey` type to support flexible keying, including **hierarchical partition keys** and composite keys, allowing collections to be configured with `string`, `Guid`, or `CosmosNoSqlKey` as their primary identifier. Crucially, operations like `GetAsync`, `UpdateAsync`, and `DeleteAsync` within `CosmosNoSqlCollection` are optimized to leverage **efficient point reads** (`ReadItem` and `ReadManyItemsAsync`) and correctly supply partition keys, significantly improving performance. This work provides greater flexibility in data modeling and boosts the efficiency of data access patterns for applications using the Cosmos NoSQL provider. | Feb 17 | 31 | grow |
| c6f07d6 | This commit **implements SQL Server hybrid search**, introducing a **new capability** to combine vector and keyword search within the **.NET VectorData SQL Server module**. It adds the `IKeywordHybridSearchable` interface and `HybridSearchAsync` to `SqlServerCollection.cs`, while `SqlServerCommandBuilder.cs` gains `SelectHybrid` for RRF-based SQL query generation and enhanced `CreateCollection` for full-text indexing. A **refactoring** in `SqlServerFilterTranslator.cs` supports table aliases crucial for filtering in hybrid search CTEs, and the new functionality is registered via `AddSqlServerVectorStore`. This significantly **enhances the search functionality**, enabling more comprehensive results, and is thoroughly validated with new conformance tests that include full-text index population. | Feb 11 | 6 | grow |
| e8b4862 | This commit **enhances** the **PgVector library's** .NET integration by standardizing `DateTime` property mapping to PostgreSQL. It **defaults** `DateTime` values to the `timestamptz` (UTC) type, aligning with Npgsql and EF Core. A new `WithStoreType()` extension method is introduced, providing a **new capability** for developers to **explicitly configure** `DateTime` properties to map to the `timestamp` (non-UTC) type. This change impacts **data type mapping**, **SQL generation**, and **filter translation** within the `PgVector` component, offering greater flexibility and consistency in handling date and time data. | Feb 11 | 11 | maint |
| b0d621f | This commit introduces **hybrid keyword and vector similarity search capabilities** for the **PostgreSQL vector store** within the .NET `VectorData` library. It implements the `IKeywordHybridSearchable` interface in `PostgresCollection.cs` and adds robust SQL generation for full-text GIN indexes and RRF-based hybrid queries via `PostgresSqlBuilder.cs`. Additionally, this change enhances the `VectorData.Abstractions` by introducing `ProviderAnnotations` to `PropertyModel` and `VectorStoreProperty`, enabling **provider-specific configuration** for data properties, such as defining the full-text search language for PostgreSQL. This **new capability** significantly expands the search functionality and configurability of the PostgreSQL vector data provider, allowing for more nuanced and powerful data retrieval. New conformance and unit tests have been added to validate the hybrid search functionality and property mapping. | Feb 5 | 12 | grow |
| 7d7a649 | This commit introduces a **new capability** to the **Microsoft.Extensions.VectorData (MEVD)** subsystem, allowing users to specify a `score threshold` for vector and hybrid search operations. This feature enables filtering search results to include only matches above a defined relevance score, thereby improving the quality and precision of returned data across various providers. For providers lacking native support, such as **Azure AI Search**, **In-Memory**, and **Pinecone**, the filtering is pragmatically applied client-side, while others like **Cosmos MongoDB**, **PostgreSQL**, and **SQL Server** integrate the threshold directly into their native queries or aggregation pipelines. The `ScoreThreshold` property has been added to `VectorSearchOptions` and `HybridSearchOptions` in `VectorData.Abstractions`, ensuring a consistent API for this enhancement across the diverse **MEVD data providers**. This **feature implementation** significantly enhances the utility of vector search by providing more granular control over result relevance. | Feb 5 | 26 | grow |
| cbb1f21 | This commit introduces a significant **refactoring** within the **VectorData** component by establishing a new abstract class, `FilterTranslatorBase`. This base class **centralizes common filter preprocessing and expression matching logic**, effectively **eliminating substantial code duplication** across various data provider integrations. Multiple existing filter translators, including those for **Azure AI Search, SQL, CosmosDB, MongoDB, Pinecone, Qdrant, and Redis**, have been updated to inherit from `FilterTranslatorBase`, improving **maintainability and consistency**. Additionally, `FilterPreprocessingOptions` is introduced to configure the new preprocessing behavior, enhancing the extensibility of filter translation. | Feb 1 | 11 | maint |
| 7918395 | This commit introduces **key auto-generation** across the **.NET VectorData providers**, allowing primary keys to be automatically generated when a record is saved with a default key value. This **new capability** adds an `IsAutoGenerated` property to key definitions and attributes, enabling both **client-side GUID generation** and **database-side auto-incrementing** for relational providers like PostgreSQL and SQL Server, while still permitting manual key assignment. The change involves extensive **refactoring** of key property validation logic within `CollectionModelBuilder` and specific provider `ModelBuilder` implementations, alongside updates to `UpsertAsync` methods in numerous data connectors. This enhancement standardizes key handling, aligning with Entity Framework's behavior and simplifying data ingestion for higher-level components such as `Microsoft.Extensions.DataIngestion`. | Jan 28 | 49 | grow |
| fe11ab6 | This commit introduces a **new capability** to the **VectorData filter translation** system, enabling support for `Enumerable.Any(x => x.Contains(...))` expressions. This allows users to filter data based on whether any element within a collection field contains a specified substring. The functionality is implemented across various backend providers, including **Azure AI Search, Cosmos NoSQL, Qdrant, Redis, Weaviate, PostgreSQL, and SQL Server**, enhancing the expressiveness of queries. Conformance tests were updated to validate this new feature and explicitly mark providers like MongoDB, Pinecone, and SQLite where this specific filter pattern is not yet supported. | Jan 27 | 14 | grow |
Commit activity distribution by hour and day of week. Shows when this developer is most active.
Developers who frequently work on the same files and symbols. Higher score means stronger code collaboration.