  1. Dec 16, 2024
  2. Dec 11, 2024
  3. Dec 03, 2024
  4. Nov 26, 2024
  5. Nov 22, 2024
  6. Nov 18, 2024
  7. Nov 06, 2024
  8. Oct 31, 2024
  9. Oct 30, 2024
  10. Oct 28, 2024
  11. Oct 24, 2024
  12. Oct 23, 2024
    • subcmds: reduce multiprocessing serialization overhead · 8da4861b
      Kuang-che Wu authored
      Follow the same approach as 39ffd997 to reduce serialization overhead.
      
      The benchmarks below were run with 2.7k projects on my workstation
      (warm cache). Git tracing was disabled for the benchmarks.
      
      (seconds)              | v2.48 | v2.48 | this CL | this CL
                             |       |  -j32 |         |    -j32
      -----------------------------------------------------------
      with clean tree state:
      branches (none)        |   5.6 |   5.9 |    1.0  |    0.9
      status (clean)         |  21.3 |   9.4 |   19.4  |    4.7
      diff (none)            |   7.6 |   7.2 |    5.7  |    2.2
      prune (none)           |   5.7 |   6.1 |    1.3  |    1.2
      abandon (none)         |  19.4 |  18.6 |    0.9  |    0.8
      upload (none)          |  19.7 |  18.7 |    0.9  |    0.8
      forall -c true         |   7.5 |   7.6 |    0.6  |    0.6
      forall -c "git log -1" |  11.3 |  11.1 |    0.6  |    0.6
      
      with branches:
      start BRANCH --all     |  21.9 |  20.3 |   13.6  |    2.6
      checkout BRANCH        |  29.1 |  27.8 |    1.1  |    1.0
      branches (2)           |  28.0 |  28.6 |    1.5  |    1.3
      abandon BRANCH         |  29.2 |  27.5 |    9.7  |    2.2
      
      Bug: b/371638995
      Change-Id: I53989a3d1e43063587b3f52f852b1c2c56b49412
      Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/440221
      
      
      Reviewed-by: Josip Sokcevic <sokcevic@google.com>
      Tested-by: Kuang-che Wu <kcwu@google.com>
      Commit-Queue: Kuang-che Wu <kcwu@google.com>
    • sync: reduce multiprocessing serialization overhead · 39ffd997
      Kuang-che Wu authored
      Background:
       - The manifest object is large (for projects like Android) in terms of
         serialization cost and size (more than 1 MB).
       - Many Project objects usually share only a few manifest objects.
      
      Before this CL, Project objects were passed to workers via function
      parameters. Function parameters are pickled separately, one chunk at a
      time. In other words, manifests were serialized again and again. The
      major serialization overhead of repo sync was
        O(manifest_size * projects / chunksize)
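
The cost described above can be illustrated with a small sketch. It is not code from repo: the `manifest` and `projects` dictionaries are hypothetical stand-ins for the real Manifest and Project objects, and the sizes are made up. It compares pickling the projects chunk by chunk with pickling them all in one call, where pickle's memo writes the shared manifest only once.

```python
import pickle

# Hypothetical stand-ins: one large "manifest" shared by many small "projects".
manifest = {"payload": bytes(1_000_000)}
projects = [{"name": f"project-{i}", "manifest": manifest} for i in range(20)]

chunksize = 5
chunks = [projects[i:i + chunksize] for i in range(0, len(projects), chunksize)]

# Each chunk is pickled on its own, so every chunk carries a full copy of
# the manifest: roughly manifest_size * projects / chunksize bytes in total.
per_chunk_size = sum(len(pickle.dumps(chunk)) for chunk in chunks)

# One pickle call over all projects: the manifest is written once, and later
# references to it are short memo lookups.
single_call_size = len(pickle.dumps(projects))

print(f"pickled per chunk:   {per_chunk_size} bytes")
print(f"pickled in one call: {single_call_size} bytes")
```

With four chunks, the per-chunk total is about four times the manifest size, while the single call stays close to one manifest.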
      
      This CL uses the following tricks to reduce serialization overhead.
       - All projects are pickled in one invocation. Because Project objects
         share manifests, the pickle library remembers which objects it has
         already seen and avoids the repeated serialization cost.
       - Pass the Project objects to workers at worker initialization time,
         and pass project indexes as function parameters instead. The number
         of workers is much smaller than the number of projects.
       - Worker init state is shared on Linux (fork based), so it requires
         zero serialization for Project objects.
      
      On Linux (fork based), the serialization overhead is
        O(projects)  --- one int per project
      On Windows (spawn based), the serialization overhead is
        O(manifest_size * min(workers, projects))
      
      Moreover, use chunksize=1 to avoid the chance that some workers are idle
      while other workers still have more than one job in their chunk queue.
      
      Using 2.7k projects as the baseline, a no-op "repo sync" originally
      took 31s for fetch and 25s for checkout on my Linux workstation.
      With this CL, it takes 12s for fetch and 1s for checkout.
      
      Bug: b/371638995
      Change-Id: Ifa22072ea54eacb4a5c525c050d84de371e87caa
      Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/439921
      
      
      Tested-by: Kuang-che Wu <kcwu@google.com>
      Reviewed-by: Josip Sokcevic <sokcevic@google.com>
      Commit-Queue: Kuang-che Wu <kcwu@google.com>
  13. Oct 18, 2024
  14. Oct 07, 2024
  15. Oct 03, 2024
  16. Sep 26, 2024
  17. Sep 25, 2024
  18. Sep 19, 2024
    • project: Copy and link files even with local branches · b577444a
      Brian Norris authored
      In the winding maze that constitutes Sync_LocalHalf(), there are paths
      in which we don't copy-and-link files. Examples include something like:
      
        cd some/project/
        repo start head .
        # do some work, make some commit, upload that commit to Gerrit
      
        [[ ... in the meantime, someone adds a <linkfile ...> for
           some/project/ in the manifest ... ]]
      
        cd some/project/
        git pull --rebase
        repo sync
      
      In this case, we never hit a `repo rebase` case, which might have saved
      us. Instead, the developer is left confused about why some/project/
      never had its <linkfile>s created.
      
      Notably, this opens up one more corner case in which <linkfile ... /> or
      <copyfile ... /> could potentially clobber existing work in the
      destination directory, but there are existing cases where that's true,
      and frankly, those seem like bigger holes than this new one.
      
      Change-Id: I394b0e4529023a8ee319dc25d03d513a19251a4a
      Reviewed-on: https://gerrit-review.googlesource.com/c/git-repo/+/437421
      
      
      Reviewed-by: Josip Sokcevic <sokcevic@google.com>
      Tested-by: Brian Norris <briannorris@google.com>
      Commit-Queue: Brian Norris <briannorris@google.com>
  19. Sep 12, 2024
  20. Aug 30, 2024
  21. Aug 28, 2024
  22. Jul 02, 2024
  23. Jul 01, 2024
  24. May 23, 2024
  25. May 16, 2024
  26. May 14, 2024