Navigara
Organizations | Distribution | Compare | Research

Nicolas De Carli

Developer

ndecarli@meta.com

40 commits | ~3 files/commit

Performance

YoY: +650%
Chart series: 2026, Previous year

Insights

Key patterns and highlights from this developer's activity.

Peak Month: Jun '25 (114 performance)
Growth Trend: ↑121% vs prior period
Avg Files/Commit: 3 files per commit
Active Days: 29 of 455 days
Top Repo: folly (29 commits)

Effort Over Time

Breakdown of growth, maintenance, and fixes effort over time.

Bug Behavior

Beta

Bugs introduced vs. fixed over time.

Investment Quality

Beta

Reclassifies engineering effort based on bug attribution. Commits that introduced bugs are retrospectively counted as poor investments.

Productive Time: 64% (Growth 99% + Fixes 1%)
Maintenance Time: 34%
Wasted Time: 2%
How it works

Methodology

Investment Quality reclassifies engineering effort based on bug attribution data. For commits identified as buggy origins (commits that introduced a bug someone later fixed), grow and maintenance time is moved into the Wasted Time category. Their waste time (fix work) is still counted as productive. All other commits keep their standard classification: grow is productive, maintenance is maintenance, and waste (fixes) is productive.

Relationship to Growth / Maintenance / Fixes

The standard model classifies commits as Growth, Maintenance, or Fixes. Investment Quality adds a quality lens: a commit that introduced a bug is retrospectively counted as a poor investment — the engineering time spent on it was wasted because it ultimately required additional fix work. Fix commits (Fixes in the standard model) are reframed as productive, because fixing bugs is valuable work.
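The reclassification rule described above can be sketched in a few lines of Python. This is an illustration only, not Navigara's implementation: the `CommitEffort` record, its field names, and the idea that each commit carries separate grow, maint, and waste amounts are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CommitEffort:
    """Hypothetical per-commit effort record (field names are assumptions)."""
    grow: float          # effort labelled Growth in the standard model
    maint: float         # effort labelled Maintenance
    waste: float         # effort labelled Fixes ("waste") in the standard model
    buggy_origin: bool   # True if the commit introduced a bug that was later fixed


def investment_quality(commits):
    """Return (productive, maintenance, wasted) totals under the stated rule."""
    productive = maintenance = wasted = 0.0
    for c in commits:
        if c.buggy_origin:
            # Grow and maintenance effort of a buggy origin is reclassified as wasted.
            wasted += c.grow + c.maint
        else:
            productive += c.grow
            maintenance += c.maint
        # Fix effort counts as productive in every case.
        productive += c.waste
    return productive, maintenance, wasted


def as_percentages(productive, maintenance, wasted):
    """Convert the three totals into percentage shares of the overall effort."""
    total = productive + maintenance + wasted
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(100.0 * x / total for x in (productive, maintenance, wasted))
```

With this sketch, a clean commit contributes to Productive and Maintenance exactly as in the standard model, while a buggy origin contributes only its fix effort to Productive and everything else to Wasted.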

Proposed API Endpoint

Investment Quality is currently computed client-side from commit and bug attribution data. The ideal server-side endpoint would be:

POST /v1/organizations/{orgId}/investment-quality
Content-Type: application/json

Request:
{
  "startTime": "2025-01-01T00:00:00Z",
  "endTime": "2025-12-31T23:59:59Z",
  "bucketSize": "BUCKET_SIZE_MONTH",
  "groupBy": ["repository_id" | "deliverer_email"]
}

Response:
{
  "productivePct": 74,
  "maintenancePct": 18,
  "wastedPct": 8,
  "buckets": [
    {
      "bucketStart": "2025-01-01T00:00:00Z",
      "productive": 4.2,
      "maintenance": 1.8,
      "wasted": 0.6
    }
  ]
}
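A client consuming a response of this shape might recompute the headline percentages from the buckets, for example to cross-check the summary fields. The sketch below assumes that `productivePct`, `maintenancePct`, and `wastedPct` are rounded shares of the bucket sums; the source does not state how the server would derive them, and the helper name `summarize_buckets` is hypothetical.

```python
import json


def summarize_buckets(response_json: str) -> dict:
    """Sum the buckets and return rounded percentage shares per category.

    Assumption: the summary percentages are shares of the bucket totals.
    """
    data = json.loads(response_json)
    totals = {"productive": 0.0, "maintenance": 0.0, "wasted": 0.0}
    for bucket in data.get("buckets", []):
        for key in totals:
            totals[key] += float(bucket.get(key, 0.0))
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {key: 0 for key in totals}
    return {key: round(100 * value / grand_total) for key, value in totals.items()}
```

The sample response above lists a single bucket, so recomputing from that bucket alone will not reproduce its top-level 74 / 18 / 8 split; those figures presumably cover the whole requested time range.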

Recent Activity

Latest analyzed commits from this developer.

5363868 | Mar 8

This commit introduces a **performance optimization** for **AARCH64 CRC32C calculations** by adding a specialized routine, `neon_eor3_crc32c_small`, designed for short input sizes. The `folly::hash::crc32c` function is updated to dispatch to this new, faster implementation when processing smaller data blocks on AARCH64 platforms. This **enhancement** significantly improves the speed of CRC32C computations for inputs up to 1KB, leading to more efficient checksum generation.

3 files | effort: grow

ddf751a | Mar 7

This commit introduces a **performance optimization** for the **CRC32C calculation** on **AARCH64 (ARM64) architectures**. It **refactors** the existing `neon_eor3_crc32c` implementation within `folly/external/fast-crc32`, renaming it to `neon_eor3_crc32c_v8s2x4e_s2x1` and rewriting its internal logic to better suit server-class CPUs. This **enhancement** significantly improves the speed of CRC32C computations, as demonstrated by benchmark results showing notable gains across various data sizes. The `folly/hash/Checksum` module and its associated build configurations are updated to leverage this faster implementation, providing a direct benefit to applications performing CRC32C checks on AARCH64 systems.

11 files | effort: maint

4195b11 | Mar 7

This commit introduces a **performance optimization** for **AARCH64 CRC32 calculations** by adding a specialized routine, `neon_eor3_crc32_small`, designed for short inputs. The **`folly::hash::Checksum`** module's `crc32` function now conditionally dispatches to this new, faster implementation for data sizes under 1536 bytes. This **enhancement** significantly **improves throughput** for small data blocks on AARCH64 platforms, with performance gains ranging from 1% to 57% depending on input size.

3 files | effort: grow

9951f94 | Mar 7

This commit introduces a **new, highly optimized CRC32 implementation** specifically for **AArch64 server-class CPUs**, leveraging NEON EOR3 and SHA3 instructions. This **performance optimization** significantly improves the throughput of CRC32 calculations within the **`folly/hash` module**, with reported gains of 14% to 25%. The `folly/hash/Checksum.cpp` dispatch logic is updated to dynamically select and utilize this faster algorithm when supported by the hardware. This enhancement provides a substantial boost to data integrity check efficiency on compatible AArch64 platforms.

9 files | effort: grow

12fa4cb | Mar 2

This commit **enhances the performance of `folly::ConcurrentHashMap`** by integrating **ARM Scalable Vector Extension (SVE) intrinsics**. Specifically, it adds SVE support to the internal `tagMatchIter` function, allowing for more efficient tag filtering using the `MATCH` instruction. This **optimization** improves the speed of `find()` operations within the hash map, resulting in a measurable reduction in average lookup times. The change primarily affects the **`folly::concurrency`** module, providing a **performance boost** for applications running on ARM architectures with SVE capabilities.

1 file | effort: grow

5915f9d | Feb 25

This commit **refactors** the `memset` selection logic within the **folly library** for **AArch64 architectures**. It moves the DC ZVA (Data Cache Zero by VA) block-size check from being performed at each `memset` call to a single check at **load time**. This **performance optimization** is achieved by integrating an inline assembly instruction directly into the C++ code, specifically affecting the `__folly_detail_memset_resolve` symbol in `folly/memset_select_aarch64.cpp`. The change aims to reduce overhead and **improve the efficiency** of `memset` operations by eliminating redundant checks.

2 files | effort: maint

bdfb33e | Feb 25

This commit introduces a **new, highly optimized `memset` implementation** for **AArch64 platforms** by leveraging **ARM's Scalable Vector Extension (SVE)**. A new assembly file, `folly/external/aor/memset-sve.S`, provides the SVE-specific logic, which is then integrated into **Folly's low-level memory utilities** via `folly/memset_select_aarch64.cpp` to dynamically select this version when SVE hardware is detected. This **performance optimization** significantly improves `memset` throughput, particularly for **small input sizes**, as demonstrated by benchmark results. The change enhances **Folly's core memory operations** on compatible ARM processors, providing a faster `memset` for applications running on SVE-enabled systems.

4 files | effort: grow

162a55b | Feb 24

This commit **optimizes performance** for the **`folly::F14Table`** container on **AArch64 architectures**. It **refactors** the internal `occupiedIter` method within `folly/container/detail/F14Table.h` to utilize `SparseMaskIter` instead of `DenseMaskIter`. This change leverages the improved simplicity and efficiency of `SparseMaskIter` after a previous update, resulting in approximately **10% faster execution** for `f14Node`'s `CopyCtor`, `Destructor`, and `Clear` operations.

1 file | effort: maint

b26baa2 | Feb 24

This commit introduces a **performance optimization** within the **`folly/container/detail/F14Mask.h`** component, specifically for the `SparseMaskIter::next()` method. By adding an `assume` clause, the compiler is now informed that a specific index variable `i` will always be a multiple of 4. This allows the Aarch64 compiler to **eliminate a redundant `AND` instruction** from the generated assembly code. The removal of this pipelined instruction **reduces execution latency by 1 cycle**, thereby improving the **speed of successful finds** on Aarch64 architectures.

1 file | effort: maint

87e9753 | Feb 24

This commit introduces a **performance optimization** for the **SparseMaskIter** within `folly/container/detail/F14Mask.h`, specifically targeting **AArch64** platforms. It addresses the lack of a direct CTZ (Count Trailing Zeros) instruction on armv9a by implementing an efficient sequence of `RBIT` (Reverse Bits) followed by `CLZ` (Count Leading Zeros). The change modifies the `SparseMaskIter`'s internal logic and adds a new helper function, `findLastSetNonZero`, to improve instruction scheduling and reduce speculative execution in the iteration loop. This **AArch64-specific improvement** is expected to benefit operations like `occupiedIter` by providing a more efficient way to find the next set bit.

1 file | effort: maint

e978c48 | Feb 24

This commit **enhances the performance** of the `bitReverse` function within the **`folly::lang::Bits` module** by leveraging **Clang's compiler builtins** for efficient bit reversal operations. The original implementation has been refactored into a `bitReverseFallback` function, ensuring continued functionality for environments without builtin support. This change is a **performance optimization** and **refactoring** that improves the speed of bit manipulation. New test cases were added to `folly/lang/test/BitsTest.cpp` to verify the correctness of the `bitReverseFallback` implementation. This provides a significant **performance boost** for bit reversal operations across the `folly` library.

2 files | effort: maint

0d02286 | Feb 24

This commit **optimizes** the **Folly F14Table**'s tag matching logic, specifically within the `tagMatchIter` function in `folly/container/detail/F14Table.h`. It **refactors** the underlying **ARM SVE** instruction sequence by replacing a `cmeq` instruction with a `mov` instruction. This change aims to improve the performance of `find` operations, particularly when matches are not found in early iterations, by reducing speculative execution overhead. Although benchmarks did not show a significant difference, this modification is theoretically more efficient according to ARM experts, enhancing the data structure's search efficiency.

1 file | effort: maint

505140c | Feb 23

This commit introduces a **performance optimization** for **Folly's F14Table container** on **AArch64 architectures**. It modifies the internal tag matching logic within `find` operations, specifically in methods like `find`, `find_if`, and `tagMatchIter`, to utilize the ARM SVE `MATCH` instruction. This change allows for quicker branching when searching for elements, resulting in a **~10% reduction in `find` latency**. This **architectural-specific improvement** enhances the efficiency of `F14Table` for applications running on AArch64.

1 file | effort: grow

4055edf | Sep 4

This commit provides a **bug fix** to resolve **OSS build breaks** encountered when compiling **`folly::ConcurrentHashMap`** on **AARCH64** without targeting the CRC feature set. It adjusts preprocessor conditions within `folly/concurrency/ConcurrentHashMap.h` and `folly/concurrency/detail/ConcurrentHashMap-detail.h` to ensure that ARM intrinsics and SIMD features are only enabled when `FOLLY_F14_CRC_INTRINSIC_AVAILABLE` is true for `AARCH64`. This prevents incorrect inclusion of CRC-dependent SIMD optimizations, thereby eliminating compilation failures and ensuring the robust build and correct functionality of `ConcurrentHashMap`'s optimized paths in diverse build environments.

4 files | effort: waste

e9d2b0e | Aug 14

This commit introduces **NEON intrinsics** to the **`folly::RWSpinLock`** implementation, specifically targeting **AARCH64 platforms**. This **performance enhancement** optimizes the internal lock and unlock operations, building upon existing SSE optimizations for other architectures. The change significantly **reduces synchronization overhead** for applications utilizing `RWSpinLock` on ARM-based systems, as demonstrated by improved benchmark results across various thread counts. New benchmark cases were also added to `folly/synchronization/test/SmallLocksBenchmark.cpp` to validate these optimizations.

3 files | effort: maint

30c7cc4 | Jul 9

This commit implements a **maintenance fix** to address **test timeouts** occurring in the **Thrift C++2 protocol unit tests**. Specifically, it **reduces the number of iterations** within the `runBigListTest` function's loop in `thrift/lib/cpp2/protocol/test/ProtocolTest.cpp` and adjusts the constant used for `intListSize` calculation. This change partly rolls back previous modifications that increased test complexity, thereby **improving test stability and reliability**, especially on builds without compiler optimizations.

1 file | effort: maint

5525a6e | Jul 2

This commit provides a **build fix** for the **`folly::concurrency::ConcurrentHashMap`** module, specifically within its `SIMDTable` implementation. It addresses a compilation issue by adding an explicit cast to `vreinterpretq_u8_u64` in `folly/concurrency/detail/ConcurrentHashMap-detail.h`. This change resolves a **build error** that was occurring on some platforms, ensuring successful compilation and preventing build failures for users of the `ConcurrentHashMap`.

1 file | effort: waste

e3e41cf | Jul 2

This commit **implements and enables** `ConcurrentHashMapSIMD` for **AARCH64 architectures**, extending the `folly::concurrency` module with **NEON-based SIMD intrinsics** for tag filtering and hash splitting in `folly/concurrency/detail/ConcurrentHashMap-detail.h`. This **new capability** allows AARCH64 builds to utilize the vectorized hash map, which previously defaulted to the non-SIMD version. Consequently, applications on AARCH64 can now leverage `ConcurrentHashMapSIMD` for potential **performance improvements**, with updated test suites in `folly/concurrency/test/ConcurrentHashMapTest.cpp` and `folly/concurrency/test/ConcurrentHashMapStressTest.cpp` confirming this enablement.

4 files | effort: grow

2bf0c97 | Jun 26

This commit introduces a significant **performance optimization** for **Folly's F14 containers** by switching their string hashing algorithm to `rapidhashNano`. Specifically, the `TransparentRangeHash` utility, used for heterogeneous lookups, now leverages this more efficient hash function, as seen in `folly/container/HeterogeneousAccess.h`. This change results in substantial **CPU usage reduction** and **hashing time improvements** across various services like AdRanker and AdFinder, with benchmarks showing gains of up to 75% on both AMD64 and aarch64 architectures. The work also includes minor build configuration updates and a refactor in `folly/portability/Constexpr.h`.

6 files | effort: grow

2242ff6 | Jun 13

This commit **refactors** the **`rapidhash`** library by updating its data loading mechanisms. Specifically, the `rapidhash_read32` and `rapidhash_read64` functions within `folly/external/rapidhash/rapidhash.h` now utilize the new `folly::constexprLoadUnaligned` utility. This **maintenance** change replaces previously custom-defined `constexpr` unaligned loading functions, standardizing the approach to unaligned data access within `rapidhash` and improving code consistency across the `folly` project.

1 file | effort: maint

Work Patterns

Beta

Commit activity distribution by hour and day of week. Shows when this developer is most active.

Collaboration

Beta

Developers who frequently work on the same files and symbols. Higher score means stronger code collaboration.
