Commits · clo-github/main · CodeLinaro / le / external / github / pytorch / cpuinfo

Mar 17, 2024

Include support for Windows on Arm on BUILD.bazel along with proper Volterra detection (#220) · 6543fec0

Everton Constantino authored 11 months ago

This MR adds support for building with Bazel on cpu `arm64_windows`. I also tried it on my Volterra Windows Dev Kit and noticed that the core string differs from what the current source code defines. I don't know whether this is because my hardware is slightly different.

I ran the tests with the following results

```
[==========] Running 132 tests from 28 test suites.
[----------] Global test environment set-up.
[----------] 1 test from PROCESSORS_COUNT
[ RUN      ] PROCESSORS_COUNT.non_zero
[       OK ] PROCESSORS_COUNT.non_zero (0 ms)
[----------] 1 test from PROCESSORS_COUNT (0 ms total)

[----------] 1 test from PROCESSORS
[ RUN      ] PROCESSORS.non_null
[       OK ] PROCESSORS.non_null (0 ms)
[----------] 1 test from PROCESSORS (0 ms total)

[----------] 13 tests from PROCESSOR
[ RUN      ] PROCESSOR.non_null
[       OK ] PROCESSOR.non_null (0 ms)
[ RUN      ] PROCESSOR.valid_smt_id
[       OK ] PROCESSOR.valid_smt_id (0 ms)
[ RUN      ] PROCESSOR.valid_core
[       OK ] PROCESSOR.valid_core (0 ms)
[ RUN      ] PROCESSOR.consistent_core
[       OK ] PROCESSOR.consistent_core (0 ms)
[ RUN      ] PROCESSOR.valid_cluster
[       OK ] PROCESSOR.valid_cluster (0 ms)
[ RUN      ] PROCESSOR.consistent_cluster
[       OK ] PROCESSOR.consistent_cluster (0 ms)
[ RUN      ] PROCESSOR.valid_package
[       OK ] PROCESSOR.valid_package (0 ms)
[ RUN      ] PROCESSOR.consistent_package
[       OK ] PROCESSOR.consistent_package (0 ms)
[ RUN      ] PROCESSOR.consistent_l1i
[       OK ] PROCESSOR.consistent_l1i (0 ms)
[ RUN      ] PROCESSOR.consistent_l1d
[       OK ] PROCESSOR.consistent_l1d (0 ms)
[ RUN      ] PROCESSOR.consistent_l2
[       OK ] PROCESSOR.consistent_l2 (0 ms)
[ RUN      ] PROCESSOR.consistent_l3
[       OK ] PROCESSOR.consistent_l3 (0 ms)
[ RUN      ] PROCESSOR.consistent_l4
[       OK ] PROCESSOR.consistent_l4 (0 ms)
[----------] 13 tests from PROCESSOR (7 ms total)

[----------] 1 test from CORES_COUNT
[ RUN      ] CORES_COUNT.within_bounds
[       OK ] CORES_COUNT.within_bounds (0 ms)
[----------] 1 test from CORES_COUNT (0 ms total)

[----------] 1 test from CORES
[ RUN      ] CORES.non_null
[       OK ] CORES.non_null (0 ms)
[----------] 1 test from CORES (0 ms total)

[----------] 10 tests from CORE
[ RUN      ] CORE.non_null
[       OK ] CORE.non_null (0 ms)
[ RUN      ] CORE.non_zero_processors
[       OK ] CORE.non_zero_processors (0 ms)
[ RUN      ] CORE.consistent_processors
[       OK ] CORE.consistent_processors (0 ms)
[ RUN      ] CORE.valid_core_id
[       OK ] CORE.valid_core_id (0 ms)
[ RUN      ] CORE.valid_cluster
[       OK ] CORE.valid_cluster (0 ms)
[ RUN      ] CORE.consistent_cluster
[       OK ] CORE.consistent_cluster (0 ms)
[ RUN      ] CORE.valid_package
[       OK ] CORE.valid_package (0 ms)
[ RUN      ] CORE.consistent_package
[       OK ] CORE.consistent_package (0 ms)
[ RUN      ] CORE.known_vendor
[       OK ] CORE.known_vendor (0 ms)
[ RUN      ] CORE.known_uarch
[       OK ] CORE.known_uarch (0 ms)
[----------] 10 tests from CORE (5 ms total)

[----------] 1 test from CLUSTERS_COUNT
[ RUN      ] CLUSTERS_COUNT.within_bounds
[       OK ] CLUSTERS_COUNT.within_bounds (0 ms)
[----------] 1 test from CLUSTERS_COUNT (0 ms total)

[----------] 1 test from CLUSTERS
[ RUN      ] CLUSTERS.non_null
[       OK ] CLUSTERS.non_null (0 ms)
[----------] 1 test from CLUSTERS (0 ms total)

[----------] 14 tests from CLUSTER
[ RUN      ] CLUSTER.non_null
[       OK ] CLUSTER.non_null (0 ms)
[ RUN      ] CLUSTER.non_zero_processors
[       OK ] CLUSTER.non_zero_processors (0 ms)
[ RUN      ] CLUSTER.valid_processors
[       OK ] CLUSTER.valid_processors (0 ms)
[ RUN      ] CLUSTER.consistent_processors
[       OK ] CLUSTER.consistent_processors (0 ms)
[ RUN      ] CLUSTER.non_zero_cores
[       OK ] CLUSTER.non_zero_cores (0 ms)
[ RUN      ] CLUSTER.valid_cores
[       OK ] CLUSTER.valid_cores (0 ms)
[ RUN      ] CLUSTER.consistent_cores
[       OK ] CLUSTER.consistent_cores (0 ms)
[ RUN      ] CLUSTER.valid_cluster_id
[       OK ] CLUSTER.valid_cluster_id (0 ms)
[ RUN      ] CLUSTER.valid_package
[       OK ] CLUSTER.valid_package (0 ms)
[ RUN      ] CLUSTER.consistent_package
[       OK ] CLUSTER.consistent_package (0 ms)
[ RUN      ] CLUSTER.consistent_vendor
[       OK ] CLUSTER.consistent_vendor (0 ms)
[ RUN      ] CLUSTER.consistent_uarch
[       OK ] CLUSTER.consistent_uarch (0 ms)
[ RUN      ] CLUSTER.consistent_midr
[       OK ] CLUSTER.consistent_midr (0 ms)
[ RUN      ] CLUSTER.consistent_frequency
[       OK ] CLUSTER.consistent_frequency (0 ms)
[----------] 14 tests from CLUSTER (7 ms total)

[----------] 1 test from PACKAGES_COUNT
[ RUN      ] PACKAGES_COUNT.within_bounds
[       OK ] PACKAGES_COUNT.within_bounds (0 ms)
[----------] 1 test from PACKAGES_COUNT (0 ms total)

[----------] 1 test from PACKAGES
[ RUN      ] PACKAGES.non_null
[       OK ] PACKAGES.non_null (0 ms)
[----------] 1 test from PACKAGES (0 ms total)

[----------] 10 tests from PACKAGE
[ RUN      ] PACKAGE.non_null
[       OK ] PACKAGE.non_null (0 ms)
[ RUN      ] PACKAGE.non_zero_processors
[       OK ] PACKAGE.non_zero_processors (0 ms)
[ RUN      ] PACKAGE.valid_processors
[       OK ] PACKAGE.valid_processors (0 ms)
[ RUN      ] PACKAGE.consistent_processors
[       OK ] PACKAGE.consistent_processors (0 ms)
[ RUN      ] PACKAGE.non_zero_cores
[       OK ] PACKAGE.non_zero_cores (0 ms)
[ RUN      ] PACKAGE.valid_cores
[       OK ] PACKAGE.valid_cores (0 ms)
[ RUN      ] PACKAGE.consistent_cores
[       OK ] PACKAGE.consistent_cores (0 ms)
[ RUN      ] PACKAGE.non_zero_clusters
[       OK ] PACKAGE.non_zero_clusters (0 ms)
[ RUN      ] PACKAGE.valid_clusters
[       OK ] PACKAGE.valid_clusters (0 ms)
[ RUN      ] PACKAGE.consistent_cluster
[       OK ] PACKAGE.consistent_cluster (0 ms)
[----------] 10 tests from PACKAGE (5 ms total)

[----------] 1 test from UARCHS_COUNT
[ RUN      ] UARCHS_COUNT.within_bounds
[       OK ] UARCHS_COUNT.within_bounds (0 ms)
[----------] 1 test from UARCHS_COUNT (0 ms total)

[----------] 1 test from UARCHS
[ RUN      ] UARCHS.non_null
[       OK ] UARCHS.non_null (0 ms)
[----------] 1 test from UARCHS (0 ms total)

[----------] 5 tests from UARCH
[ RUN      ] UARCH.non_null
[       OK ] UARCH.non_null (0 ms)
[ RUN      ] UARCH.non_zero_processors
[       OK ] UARCH.non_zero_processors (0 ms)
[ RUN      ] UARCH.valid_processors
[       OK ] UARCH.valid_processors (0 ms)
[ RUN      ] UARCH.non_zero_cores
[       OK ] UARCH.non_zero_cores (0 ms)
[ RUN      ] UARCH.valid_cores
[       OK ] UARCH.valid_cores (0 ms)
[----------] 5 tests from UARCH (2 ms total)

[----------] 1 test from L1I_CACHES_COUNT
[ RUN      ] L1I_CACHES_COUNT.within_bounds
[       OK ] L1I_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L1I_CACHES_COUNT (0 ms total)

[----------] 1 test from L1I_CACHES
[ RUN      ] L1I_CACHES.non_null
[       OK ] L1I_CACHES.non_null (0 ms)
[----------] 1 test from L1I_CACHES (0 ms total)

[----------] 13 tests from L1I_CACHE
[ RUN      ] L1I_CACHE.non_null
[       OK ] L1I_CACHE.non_null (0 ms)
[ RUN      ] L1I_CACHE.non_zero_size
[       OK ] L1I_CACHE.non_zero_size (0 ms)
[ RUN      ] L1I_CACHE.valid_size
[       OK ] L1I_CACHE.valid_size (0 ms)
[ RUN      ] L1I_CACHE.non_zero_associativity
[       OK ] L1I_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L1I_CACHE.non_zero_partitions
[       OK ] L1I_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L1I_CACHE.non_zero_line_size
[       OK ] L1I_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L1I_CACHE.power_of_2_line_size
[       OK ] L1I_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L1I_CACHE.reasonable_line_size
[       OK ] L1I_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L1I_CACHE.valid_flags
[       OK ] L1I_CACHE.valid_flags (0 ms)
[ RUN      ] L1I_CACHE.non_inclusive
[       OK ] L1I_CACHE.non_inclusive (0 ms)
[ RUN      ] L1I_CACHE.non_zero_processors
[       OK ] L1I_CACHE.non_zero_processors (0 ms)
[ RUN      ] L1I_CACHE.valid_processors
[       OK ] L1I_CACHE.valid_processors (0 ms)
[ RUN      ] L1I_CACHE.consistent_processors
[       OK ] L1I_CACHE.consistent_processors (0 ms)
[----------] 13 tests from L1I_CACHE (7 ms total)

[----------] 1 test from L1D_CACHES_COUNT
[ RUN      ] L1D_CACHES_COUNT.within_bounds
[       OK ] L1D_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L1D_CACHES_COUNT (0 ms total)

[----------] 1 test from L1D_CACHES
[ RUN      ] L1D_CACHES.non_null
[       OK ] L1D_CACHES.non_null (0 ms)
[----------] 1 test from L1D_CACHES (0 ms total)

[----------] 13 tests from L1D_CACHE
[ RUN      ] L1D_CACHE.non_null
[       OK ] L1D_CACHE.non_null (0 ms)
[ RUN      ] L1D_CACHE.non_zero_size
[       OK ] L1D_CACHE.non_zero_size (0 ms)
[ RUN      ] L1D_CACHE.valid_size
[       OK ] L1D_CACHE.valid_size (0 ms)
[ RUN      ] L1D_CACHE.non_zero_associativity
[       OK ] L1D_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L1D_CACHE.non_zero_partitions
[       OK ] L1D_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L1D_CACHE.non_zero_line_size
[       OK ] L1D_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L1D_CACHE.power_of_2_line_size
[       OK ] L1D_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L1D_CACHE.reasonable_line_size
[       OK ] L1D_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L1D_CACHE.valid_flags
[       OK ] L1D_CACHE.valid_flags (0 ms)
[ RUN      ] L1D_CACHE.non_inclusive
[       OK ] L1D_CACHE.non_inclusive (0 ms)
[ RUN      ] L1D_CACHE.non_zero_processors
[       OK ] L1D_CACHE.non_zero_processors (0 ms)
[ RUN      ] L1D_CACHE.valid_processors
[       OK ] L1D_CACHE.valid_processors (0 ms)
[ RUN      ] L1D_CACHE.consistent_processors
[       OK ] L1D_CACHE.consistent_processors (0 ms)
[----------] 13 tests from L1D_CACHE (7 ms total)

[----------] 1 test from L2_CACHES_COUNT
[ RUN      ] L2_CACHES_COUNT.within_bounds
[       OK ] L2_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L2_CACHES_COUNT (0 ms total)

[----------] 1 test from L2_CACHES
[ RUN      ] L2_CACHES.non_null
[       OK ] L2_CACHES.non_null (0 ms)
[----------] 1 test from L2_CACHES (0 ms total)

[----------] 12 tests from L2_CACHE
[ RUN      ] L2_CACHE.non_null
[       OK ] L2_CACHE.non_null (0 ms)
[ RUN      ] L2_CACHE.non_zero_size
[       OK ] L2_CACHE.non_zero_size (0 ms)
[ RUN      ] L2_CACHE.valid_size
[       OK ] L2_CACHE.valid_size (0 ms)
[ RUN      ] L2_CACHE.non_zero_associativity
[       OK ] L2_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L2_CACHE.non_zero_partitions
[       OK ] L2_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L2_CACHE.non_zero_line_size
[       OK ] L2_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L2_CACHE.power_of_2_line_size
[       OK ] L2_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L2_CACHE.reasonable_line_size
[       OK ] L2_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L2_CACHE.valid_flags
[       OK ] L2_CACHE.valid_flags (0 ms)
[ RUN      ] L2_CACHE.non_zero_processors
[       OK ] L2_CACHE.non_zero_processors (0 ms)
[ RUN      ] L2_CACHE.valid_processors
[       OK ] L2_CACHE.valid_processors (0 ms)
[ RUN      ] L2_CACHE.consistent_processors
[       OK ] L2_CACHE.consistent_processors (0 ms)
[----------] 12 tests from L2_CACHE (6 ms total)

[----------] 1 test from L3_CACHES_COUNT
[ RUN      ] L3_CACHES_COUNT.within_bounds
[       OK ] L3_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L3_CACHES_COUNT (0 ms total)

[----------] 12 tests from L3_CACHE
[ RUN      ] L3_CACHE.non_null
[       OK ] L3_CACHE.non_null (0 ms)
[ RUN      ] L3_CACHE.non_zero_size
[       OK ] L3_CACHE.non_zero_size (0 ms)
[ RUN      ] L3_CACHE.valid_size
[       OK ] L3_CACHE.valid_size (0 ms)
[ RUN      ] L3_CACHE.non_zero_associativity
[       OK ] L3_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L3_CACHE.non_zero_partitions
[       OK ] L3_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L3_CACHE.non_zero_line_size
[       OK ] L3_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L3_CACHE.power_of_2_line_size
[       OK ] L3_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L3_CACHE.reasonable_line_size
[       OK ] L3_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L3_CACHE.valid_flags
[       OK ] L3_CACHE.valid_flags (0 ms)
[ RUN      ] L3_CACHE.non_zero_processors
[       OK ] L3_CACHE.non_zero_processors (0 ms)
[ RUN      ] L3_CACHE.valid_processors
[       OK ] L3_CACHE.valid_processors (0 ms)
[ RUN      ] L3_CACHE.consistent_processors
[       OK ] L3_CACHE.consistent_processors (0 ms)
[----------] 12 tests from L3_CACHE (6 ms total)

[----------] 1 test from L4_CACHES_COUNT
[ RUN      ] L4_CACHES_COUNT.within_bounds
[       OK ] L4_CACHES_COUNT.within_bounds (0 ms)
[----------] 1 test from L4_CACHES_COUNT (0 ms total)

[----------] 12 tests from L4_CACHE
[ RUN      ] L4_CACHE.non_null
[       OK ] L4_CACHE.non_null (0 ms)
[ RUN      ] L4_CACHE.non_zero_size
[       OK ] L4_CACHE.non_zero_size (0 ms)
[ RUN      ] L4_CACHE.valid_size
[       OK ] L4_CACHE.valid_size (0 ms)
[ RUN      ] L4_CACHE.non_zero_associativity
[       OK ] L4_CACHE.non_zero_associativity (0 ms)
[ RUN      ] L4_CACHE.non_zero_partitions
[       OK ] L4_CACHE.non_zero_partitions (0 ms)
[ RUN      ] L4_CACHE.non_zero_line_size
[       OK ] L4_CACHE.non_zero_line_size (0 ms)
[ RUN      ] L4_CACHE.power_of_2_line_size
[       OK ] L4_CACHE.power_of_2_line_size (0 ms)
[ RUN      ] L4_CACHE.reasonable_line_size
[       OK ] L4_CACHE.reasonable_line_size (0 ms)
[ RUN      ] L4_CACHE.valid_flags
[       OK ] L4_CACHE.valid_flags (0 ms)
[ RUN      ] L4_CACHE.non_zero_processors
[       OK ] L4_CACHE.non_zero_processors (0 ms)
[ RUN      ] L4_CACHE.valid_processors
[       OK ] L4_CACHE.valid_processors (0 ms)
[ RUN      ] L4_CACHE.consistent_processors
[       OK ] L4_CACHE.consistent_processors (0 ms)
[----------] 12 tests from L4_CACHE (6 ms total)

[----------] Global test environment tear-down
[==========] 132 tests from 28 test suites ran. (93 ms total)
[  PASSED  ] 132 tests.
```

with `cpu-info.exe` returning

```
Packages:
        0: Snapdragon (TM) 8cx Gen 3
Microarchitectures:
        4x Cortex-A78
        4x Cortex-X1
Cores:
        0: 1 processor (0), ARM Cortex-A78
        1: 1 processor (1), ARM Cortex-A78
        2: 1 processor (2), ARM Cortex-A78
        3: 1 processor (3), ARM Cortex-A78
        4: 1 processor (4), ARM Cortex-X1
        5: 1 processor (5), ARM Cortex-X1
        6: 1 processor (6), ARM Cortex-X1
        7: 1 processor (7), ARM Cortex-X1
Logical processors:
        0
        1
        2
        3
        4
        5
        6
        7
```

and `isa-info.exe` returning

```
Instruction sets:
        ARM v8.1 atomics: yes
        ARM v8.1 SQRDMLxH: yes
        ARM v8.2 FP16 arithmetics: yes
        ARM v8.2 FHM: no
        ARM v8.2 BF16: no
        ARM v8.2 Int8 dot product: yes
        ARM v8.2 Int8 matrix multiplication: no
        ARM v8.3 JS conversion: no
        ARM v8.3 complex: no
SIMD extensions:
        ARM SVE: no
        ARM SVE 2: no
Cryptography extensions:
        AES: yes
        SHA1: yes
        SHA2: yes
        PMULL: yes
        CRC32: yes
```

6543fec0

Mar 15, 2024

Bazel-support: Add MODULE.bazel to support Bzlmod (#229) · fb08ae01

Vertexwahn authored 1 year ago

This PR adds a `MODULE.bazel` file. This is needed for [Bzlmod](https://bazel.build/external/mod-command) support of Bazel. In the long term this will replace the `WORKSPACE.bazel` file. In the meantime, both files are needed.

fb08ae01

Feb 26, 2024
- Merge pull request #225 from fbarchard/break · aa4b2163
  Digant Desai authored 1 year ago
```
cachebreak
```
  aa4b2163
Feb 24, 2024
- Add missing break to cpuinfo_x86_decode_cache_descriptor · 39907714
  Frank Barchard authored 1 year ago
  
  39907714
Jan 23, 2024

ci: Add an Ubuntu:22.04 builder for RISC-V (#219) · 9484a6c5

Mark Ryan authored 1 year ago

cpuinfo is built for riscv64 using a riscv64 container. binfmt_misc
allows the riscv64 binaries in the container to be executed with QEMU.
This is slower than cross compiling but as there's not that much code
the build times are acceptable. It takes just under 6 minutes for the
full riscv64 github action to run. We also have the option of running
some of the built RISC-V binaries, e.g., unit tests, in the CI. It
should be easy to expand the matrix to add CI for other architectures
not natively supported by github actions.

9484a6c5

Upgrade to warning when name is truncated (#216) · 434970b5

Prashanth Swaminathan authored 1 year ago

Signal to users that the name field may not produce the expected string
if the chipset name and revision exceed the maximum size of the buffer.
In practice, this is unlikely as the buffer size is reasonably high for
a chipset name/revision.
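
One common way to detect this kind of truncation in C is to compare the return value of `snprintf` with the buffer size. The sketch below assumes that approach; `format_chipset_name` and its parameters are hypothetical names for illustration and are not taken from cpuinfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of cpuinfo): writes "<chipset> <revision>"
 * into buf. Returns false, after printing a warning, if the text did not
 * fit and was therefore truncated.
 */
static bool format_chipset_name(char *buf, size_t size, const char *chipset, const char *revision) {
    assert(buf != NULL && size > 0);
    /* snprintf returns the length the complete string would have had. */
    const int needed = snprintf(buf, size, "%s %s", chipset, revision);
    if (needed < 0 || (size_t)needed >= size) {
        fprintf(stderr, "warning: name \"%s %s\" truncated to %zu bytes\n", chipset, revision, size - 1);
        return false;
    }
    return true;
}
```

A caller can keep using the truncated, still null-terminated buffer and treat the `false` return purely as a diagnostic.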

434970b5

Jan 22, 2024

Fix RISC-V Linux build again (#215) · 9321265a

Mark Ryan authored 1 year ago

PR https://github.com/pytorch/cpuinfo/pull/204 broke the RISC-V
build by including for a second time a header file that currently
only exists in the RISC-V Android NDK.  The header is not yet
available in mainstream Linux distributions.  The header in question,
<sys/hwprobe.h>, is already included when building for Android
at the top of riscv-hw.c so the second include is unnecessary and
can be safely removed.

9321265a

Jan 09, 2024

Run Bazel build in Github Actions (#213) · 76cc10d6

Prashanth Swaminathan authored 1 year ago

As some clients rely on the Bazel build, add a workflow to verify at
least one Bazel target (linux-x86). Also, perform some minor cleanup to
comments and target branches in our workflow files.

76cc10d6

Jan 08, 2024

Adjust log levels of /proc/cpuinfo parsing (#209) · 05027368

Prashanth Swaminathan authored 1 year ago

There are a few steps in our parsing logic where we skip lines that don't match the expectations of the /proc/cpuinfo node. Reduce the log level of these lines to 'debug', as these are not generally errors and are noisy on systems that have unique cpuinfo key-value pairs.

When parsing logic encounters a higher-than-expected processor number, increase the level to warning, to indicate that an error may have occurred in the parsing step.

This does not fully address #19 but resolves the underlying noise reported.

05027368

Jan 06, 2024

Add .clang-format to enforce project style (#204) · 42bff7ad

Prashanth Swaminathan authored 1 year ago

* Add .clang-format to enforce project style

The settings here match the current settings for the pytorch/pytorch
project, with the exception that 8-character-width tabs are preferred in
place of spaces.

* Mass reformat of all .c and .h files

Now that we have a clang-format file defined, clean up all usages once.

* Enable clang-format-check workflow

Enforce clang-format consistency on all new changes.

42bff7ad

Jan 05, 2024

Fix RISC-V Linux build (#212) · 313524ab

Mark Ryan authored 1 year ago

Cpuinfo was failing to build on RISC-V Linux distributions, e.g.,
Ubuntu 23.10, as it includes a header file sys/hwprobe.h that is
not yet provided by glibc (although it is provided by bionic). We
fix the issue by only including sys/hwprobe.h when building for
Android, and invoking the hwprobe syscall directly on other
Linux distributions. The Android specific check can be removed in
the future once sys/hwprobe.h becomes available in glibc.

313524ab

Dec 08, 2023

Improve smallfile callback (#211) · 2f4c278f

Iacopo Colonnelli authored 1 year ago

This PR improves the smallfile callback error reporting, passing the
name of the inspected file in the `filename` argument instead of forcing
it to be `KERNEL_MAX_FILENAME` as before.

2f4c278f

Nov 30, 2023

Fix chipset enum name to include 'vendor_' (#210) · b8b29a16

Prashanth Swaminathan authored 1 year ago

The original change that introduced this should have used the same prefix for all enum types, for consistency's sake.

b8b29a16

Nov 28, 2023

Fix CPU_SET dynamic allocation and leak (#205) · 9d809924

Prashanth Swaminathan authored 1 year ago

The initial implementation had a number of issues:
- The allocation of the CPU_SET should be checked for a NULL return.
- The CPU_*_S macros should be used when working with dynamic sets.
- The CPU_SET needs to be cleared via CPU_ZERO_S before use.
- Dynamic CPU_SETs need to be freed after use.
- The __riscv_hwprobe syscall is expecting a set *size* not a *count*.
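
The first four points can be sketched with the glibc dynamic CPU set macros. This is a minimal illustration that assumes a Linux system with glibc; `count_marked_cpus` is a hypothetical helper and does not call `__riscv_hwprobe`.

```c
#define _GNU_SOURCE /* exposes the CPU_*_S macros in <sched.h> on glibc */
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: marks CPUs 0..cpu_count-1 in a dynamically sized
 * set and returns how many bits ended up set, or -1 if allocation failed.
 */
static int count_marked_cpus(size_t cpu_count) {
    assert(cpu_count > 0);
    cpu_set_t *set = CPU_ALLOC(cpu_count);
    if (set == NULL) {
        return -1; /* the allocation can fail and must be checked */
    }
    /* Size in bytes: this, not the CPU count, is what the _S macros take. */
    const size_t set_size = CPU_ALLOC_SIZE(cpu_count);
    CPU_ZERO_S(set_size, set); /* a freshly allocated set is uninitialized */
    for (size_t cpu = 0; cpu < cpu_count; cpu++) {
        CPU_SET_S(cpu, set_size, set);
    }
    const int marked = CPU_COUNT_S(set_size, set);
    CPU_FREE(set); /* dynamic sets must be released after use */
    return marked;
}
```

The `set_size` value is the byte size of the set, which is the kind of argument the last point refers to when it distinguishes a set *size* from a *count*.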

9d809924

Nov 20, 2023
- Add android_riscv64 to BUILD.bazel (#201) · ef634603
  Prashanth Swaminathan authored 1 year ago
  
  ef634603
Nov 16, 2023

[arm] fix the logic for identifying the valid processors (#197) · 20bd32c1

snadampal authored 1 year ago

The current logic for valid processor detection reports all CPUs, regardless of whether they are online. This causes thread over-subscription when the online CPU count is lower than the total CPU count. The fix is to publish only the online CPU count as the valid processors.
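
For context, Linux reports online CPUs as a list such as `0-3,5` in `/sys/devices/system/cpu/online`. The sketch below counts the CPUs in such a list; `count_cpus_in_list` is a hypothetical function written for illustration and is not the parser cpuinfo uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: counts the CPUs named by a kernel-style list such
 * as "0-3,5". Returns 0 if the list is empty or malformed.
 */
static uint32_t count_cpus_in_list(const char *list) {
    assert(list != NULL);
    uint32_t count = 0;
    const char *p = list;
    while (*p != '\0' && *p != '\n') {
        char *end;
        const unsigned long first = strtoul(p, &end, 10);
        if (end == p) {
            return 0; /* no digits where a CPU number was expected */
        }
        unsigned long last = first;
        if (*end == '-') {
            /* A range "first-last" rather than a single CPU. */
            p = end + 1;
            last = strtoul(p, &end, 10);
            if (end == p || last < first) {
                return 0;
            }
        }
        count += (uint32_t)(last - first + 1);
        p = end;
        if (*p == ',') {
            p++;
        } else if (*p != '\0' && *p != '\n') {
            return 0; /* unexpected separator */
        }
    }
    return count;
}
```

On a system with eight possible CPUs where only `0-3` are online, this yields 4, which is the figure a thread pool should size itself against.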

20bd32c1

Fix size check of max processor count (#199) · 9f13d15a

Prashanth Swaminathan authored 1 year ago

On 64-bit systems, size_t will not overflow when the function to get max
processors returns UINT32_MAX. Use the appropriate uint32_t type.
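
The difference between the two types can be shown with a small sketch. It assumes the check in question adds one to a maximum processor index and looks for wrap-around; both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: does "max index + 1" wrap when held in a uint32_t? */
static bool count_wraps_as_u32(uint32_t max_processor) {
    const uint32_t count = max_processor + UINT32_C(1);
    return count == 0; /* true only when max_processor is UINT32_MAX */
}

/* The same check in size_t, which misses the wrap where size_t is 64 bits wide. */
static bool count_wraps_as_size(uint32_t max_processor) {
    const size_t count = (size_t)max_processor + 1;
    return count == 0;
}
```

With `UINT32_MAX` as input, the first function reports the wrap on every platform, while the second reports it only where `size_t` is 32 bits wide.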

9f13d15a

Nov 14, 2023

Add limited support for RISC-V initialization (#190) · 4e5be9e1

Prashanth Swaminathan authored 1 year ago

* Adds header definitions for RISCV32 and RISCV64, and support in Bazel
  files for RISCV64. Adds ISA information for RISC-V and hwcap support.

* Adds support to construct the processor, core, cluster and package
  information reported by the system.

* Remaining support required for:
  - Inferring uarch of each processor (reports unknown for now).
  - Reading cache information (left empty for now).

Test: Built and ran cpu_info and isa_info on a RISC-V QEMU instance and
a RISC-V Android emulator. Confirmed that it properly reports the ISA
information as well as processor and cluster counts.

4e5be9e1

Nov 04, 2023

Add detection of Intel x86 AVX-VNNI instructions. (#196) · d6860c47

Quentin Khan authored 1 year ago

Tested using Intel SDE:

```
bash scripts/local-build.sh

OPTIONS=()
PLATFORMS=()

OPTIONS+=(-quark); PLATFORMS+=("Quark")
OPTIONS+=(-p4); PLATFORMS+=("Pentium4")
OPTIONS+=(-p4p); PLATFORMS+=("Pentium4 Prescott")
OPTIONS+=(-mrm); PLATFORMS+=("Merom")
OPTIONS+=(-pnr); PLATFORMS+=("Penryn")
OPTIONS+=(-nhm); PLATFORMS+=("Nehalem")
OPTIONS+=(-wsm); PLATFORMS+=("Westmere")
OPTIONS+=(-snb); PLATFORMS+=("Sandy Bridge")
OPTIONS+=(-ivb); PLATFORMS+=("Ivy Bridge")
OPTIONS+=(-hsw); PLATFORMS+=("Haswell")
OPTIONS+=(-bdw); PLATFORMS+=("Broadwell")
OPTIONS+=(-slt); PLATFORMS+=("Saltwell")
OPTIONS+=(-slm); PLATFORMS+=("Silvermont")
OPTIONS+=(-glm); PLATFORMS+=("Goldmont")
OPTIONS+=(-glp); PLATFORMS+=("Goldmont Plus")
OPTIONS+=(-tnt); PLATFORMS+=("Tremont")
OPTIONS+=(-snr); PLATFORMS+=("Snow Ridge")
OPTIONS+=(-skl); PLATFORMS+=("Skylake")
OPTIONS+=(-cnl); PLATFORMS+=("Cannon Lake")
OPTIONS+=(-icl); PLATFORMS+=("Ice Lake")
OPTIONS+=(-skx); PLATFORMS+=("Skylake server")
OPTIONS+=(-clx); PLATFORMS+=("Cascade Lake")
OPTIONS+=(-cpx); PLATFORMS+=("Cooper Lake")
OPTIONS+=(-icx); PLATFORMS+=("Ice Lake server")
OPTIONS+=(-knl); PLATFORMS+=("Knights landing")
OPTIONS+=(-knm); PLATFORMS+=("Knights mill")
OPTIONS+=(-tgl); PLATFORMS+=("Tiger Lake")
OPTIONS+=(-adl); PLATFORMS+=("Alder Lake")
OPTIONS+=(-mtl); PLATFORMS+=("Meteor Lake")
OPTIONS+=(-rpl); PLATFORMS+=("Raptor Lake")
OPTIONS+=(-spr); PLATFORMS+=("Sapphire Rapids")
OPTIONS+=(-gnr); PLATFORMS+=("Granite Rapids")
OPTIONS+=(-srf); PLATFORMS+=("Sierra Forest")
OPTIONS+=(-grr); PLATFORMS+=("Grand Ridge")
OPTIONS+=(-future); PLATFORMS+=("Future chip")

SDE_BIN="path/to/sde"

for I in "${!PLATFORMS[@]}"; do
  echo "${PLATFORMS["${I}"]}"
  "${SDE_BIN}" "${OPTIONS[$I]}" -- ./build/local/isa-info | grep "AVXVNNI"
done
```

Result:

```
Quark
        [error]
Merom
        [error]
Penryn
        [error]
Nehalem
        [error]
Westmere
        AVXVNNI: no
Sandy Bridge
        AVXVNNI: no
Ivy Bridge
        AVXVNNI: no
Haswell
        AVXVNNI: no
Broadwell
        AVXVNNI: no
Saltwell
        [error]
Silvermont
        AVXVNNI: no
Goldmont
        AVXVNNI: no
Goldmont Plus
        AVXVNNI: no
Tremont
        AVXVNNI: no
Snow Ridge
        AVXVNNI: no
Skylake
        AVXVNNI: no
Cannon Lake
        AVXVNNI: no
Ice Lake
        AVXVNNI: no
Skylake server
        AVXVNNI: no
Cascade Lake
        AVXVNNI: no
Cooper Lake
        AVXVNNI: no
Ice Lake server
        AVXVNNI: no
Knights landing
        AVXVNNI: no
Knights mill
        AVXVNNI: no
Tiger Lake
        AVXVNNI: no
Alder Lake
        AVXVNNI: yes
Meteor Lake
        AVXVNNI: yes
Raptor Lake
        AVXVNNI: yes
Sapphire Rapids
        AVXVNNI: yes
Granite Rapids
        AVXVNNI: yes
Sierra Forest
        AVXVNNI: yes
Grand Ridge
        AVXVNNI: yes
Future chip
        AVXVNNI: yes
```
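
For context, Intel documents AVX-VNNI as bit 4 of EAX returned by CPUID leaf 7, sub-leaf 1. A minimal sketch of that bit test follows. It takes the register value as a parameter so that it stays portable; the function name and the `os_saves_ymm` gate are assumptions for illustration, not the code this commit added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=1):EAX bit 4 is the AVX-VNNI feature flag. */
#define AVX_VNNI_BIT (UINT32_C(1) << 4)

/*
 * Hypothetical helper: leaf7_subleaf1_eax is the EAX value returned by
 * CPUID leaf 7, sub-leaf 1. os_saves_ymm is an assumed flag saying that
 * the OS preserves YMM state, without which AVX-family code cannot run.
 */
static bool has_avx_vnni(uint32_t leaf7_subleaf1_eax, bool os_saves_ymm) {
    return os_saves_ymm && (leaf7_subleaf1_eax & AVX_VNNI_BIT) != 0;
}
```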

d6860c47

Oct 19, 2023
- Add support for Arm Neoverse V2 (#194) · 76d5e8f5
  Paolo authored 1 year ago
  
  76d5e8f5
Aug 16, 2023
- Include intrin.h MSVC header in cpuinfo/utils.h · 959002f8
  Marat Dukhan authored 1 year ago
  
  959002f8
- Support building for ARM Linux with GLibC older than 2.16 · 9df83faa
  Marat Dukhan authored 1 year ago
  
  9df83faa
- Work around faulty implementations of NEON DOT instructions · dce131b2
  Marat Dukhan authored 1 year ago
```
Prevent detection of NEON DOT instruction set on Spreadtrum SC9863A and
Unisoc T310, where these instructions occasionally trigger SIGILL
```
  dce131b2
- Don't consider Cortex-A65 in AArch32 ISA detection · 3c8583da
  Marat Dukhan authored 1 year ago
```
Cortex-A65 is AArch64-only and not paired with AArch32-capable cores
```
  3c8583da
- Remove redundant newline after match_t · c15d5373
  Marat Dukhan authored 1 year ago
  
  c15d5373
- Detect Unisoc T-series chipsets · e00b4854
  Marat Dukhan authored 1 year ago
  
  e00b4854
- Remove redundant architecture version check in aarch32-isa.c · 8eab2028
  Marat Dukhan authored 1 year ago
  
  8eab2028
- Fix a bug in load_u24le introduced in #178 · 6bd16265
  Marat Dukhan authored 1 year ago
  
  6bd16265
- Workaround unimplemented GetMaximumProcessorGroupCount on WINE · 8dd68175
  Marat Dukhan authored 1 year ago
  
  8dd68175
- Fix UB in load_u16le/load_u24le/load_u32le · baa1ee9d
  Marat Dukhan authored 1 year ago
  
  baa1ee9d
- Enable CXX only when needed for tests/benchmarks · e4a50730
  James Hilliard authored 2 years ago
  
  e4a50730
- Remove redundant ctype.h include · 0a85bfea
  Marat Dukhan authored 1 year ago
  
  0a85bfea
Aug 15, 2023
- Update CMakeLists.txt: Add cmake 3.27 support · 859edec1
  Changming Sun authored 1 year ago
  
  859edec1
Aug 10, 2023

Refactor reporting of Neoverse cores in cpu-info · c13d0bbb
Marat Dukhan authored 1 year ago

c13d0bbb

Simplify registry helper functions for Windows on Arm · 2ad28d14

Lingkai Dong authored 1 year ago

The helper functions read_registry() and get_system_info_from_registry()
now return pointers to requested data on success or NULL on failure.

2ad28d14

Fix char width mismatch on Windows on Arm · 645c28c2

Lingkai Dong authored 1 year ago

Note: The data size reported by the last argument of RegGetValueW()
is the number of bytes, not the number of wchar_t in the string.
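
The conversion this note implies can be sketched as follows. The helper name is hypothetical, and the sketch assumes the reported byte size includes the terminating null character of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: converts a size in bytes, as reported for a wide
 * string including its terminating null, into the string length in
 * wchar_t units.
 */
static size_t wide_length_from_bytes(size_t size_in_bytes) {
    const size_t units = size_in_bytes / sizeof(wchar_t);
    return units > 0 ? units - 1 : 0; /* drop the terminator, if any */
}
```

Using the byte count directly as a character count would overstate the length by a factor of `sizeof(wchar_t)`.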

Change-Id: Ib57083791b5dc7ef97baf1e1e48bd070148a2032

645c28c2

Fix buffer overflow when counting uarchs on Windows on Arm · 6c983c50

Lingkai Dong authored 1 year ago

Fix an issue where the index into the uarchs[] array is incremented at
the wrong time, which causes a buffer overflow.

For example, if uarchs[] has two elements, with the first four cores
having one uarch and the other four cores having the other uarch, then
* without this fix, the wrong counts are
  uarchs[0].core_count == 1
  uarchs[1].core_count == 4
  uarchs[2].core_count == 3 (this overflows uarchs[])
* with this fix, the correct counts are
  uarchs[0].core_count == 4
  uarchs[1].core_count == 4
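
A counting loop with the index advanced at the right time might look like the sketch below. It assumes cores with the same uarch are contiguous; `struct uarch_count` and `count_uarchs` are hypothetical names, not the types used in the Windows on Arm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one entry per distinct uarch. */
struct uarch_count {
    uint32_t uarch;
    uint32_t core_count;
};

/*
 * Hypothetical helper: groups consecutive cores by uarch and returns the
 * number of entries written. It never writes past out[capacity - 1].
 */
static size_t count_uarchs(const uint32_t *core_uarch, size_t cores, struct uarch_count *out, size_t capacity) {
    assert(core_uarch != NULL && out != NULL);
    size_t used = 0;
    for (size_t i = 0; i < cores; i++) {
        if (used > 0 && out[used - 1].uarch == core_uarch[i]) {
            out[used - 1].core_count++; /* same uarch: index stays put */
            continue;
        }
        if (used == capacity) {
            break; /* no room for another uarch: stop instead of overflowing */
        }
        /* New uarch: open a new entry, and only then advance the index. */
        out[used].uarch = core_uarch[i];
        out[used].core_count = 1;
        used++;
    }
    return used;
}
```

For eight cores split four and four across two uarchs, this fills two entries with a `core_count` of 4 each.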

Change-Id: I9584aabf7859997f2826f8acb1b96aa3b8a5ee54

6c983c50

Fix the size of uarchs array on Windows on Arm · a00dacd6

Lingkai Dong authored 1 year ago

The number of elements in woa_chip_info.uarchs[] corresponds to the
maximum number of efficiency classes, rather than the number of chip
names.

Change-Id: I2166dccf085a5de52e80577617c8112cc026c961

a00dacd6

Fix reporting of processor found in Windows registry on Arm · d2a729c3

Lingkai Dong authored 1 year ago

There was a bug that get_system_info_from_registry() always returned
false regardless of what happened. To fix this, we could have returned
the success or failure of the function. However, the status can also
be inferred from whether the pointer to the chip info is NULL or not,
so we remove the redundant return type for simplicity.
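
The pattern of signalling success through the pointer alone can be sketched like this. The table, the struct layout, and `find_chip` are hypothetical stand-ins, not the registry code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chip record for illustration. */
struct chip_info {
    const char *name;
    unsigned core_count;
};

static const struct chip_info known_chips[] = {
    {"Snapdragon (TM) 8cx Gen 3", 8},
};

/*
 * Hypothetical lookup: returns a pointer to the matching record, or NULL
 * if the name is unknown. No separate boolean status is needed.
 */
static const struct chip_info *find_chip(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(known_chips) / sizeof(known_chips[0]); i++) {
        if (strcmp(known_chips[i].name, name) == 0) {
            return &known_chips[i];
        }
    }
    return NULL;
}
```

Callers test the result against `NULL`, so a status flag and the pointer can no longer disagree.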

Change-Id: If468000bf60f917892f776188a8fa29fcc0d4b7f

d2a729c3

Ensure initialization is done only once on Windows on Arm · fb31a582

Lingkai Dong authored 1 year ago

The function cpuinfo_arm_windows_init is called with
InitOnceExecuteOnce (a Windows API) which ensures the init is called
only once if it returns TRUE, even if multiple threads call it.
However, if the init returns FALSE, it will be called again from
other threads until TRUE is returned.

In our case, init should happen only once regardless of success or
failure, so cpuinfo_arm_windows_init should always return TRUE. The
actual status of initialization is indicated by setting the global
variable cpuinfo_is_initialized. This is consistent with how the x86
initialization is implemented.

Change-Id: I016a988f10e8484f81c55838b182819e6cd8c880

fb31a582