Maintenance5#5937
Conversation
02e7fd5 to
05d557b
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5937 +/- ##
==========================================
- Coverage 83.08% 83.03% -0.06%
==========================================
Files 277 277
Lines 30201 30160 -41
==========================================
- Hits 25094 25043 -51
- Misses 5107 5117 +10
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
6512ffb to
b8a7ce2
Compare
b8a7ce2 to
db3250f
Compare
addf162 to
6c12c47
Compare
| .devices | ||
| .write() | ||
| .unwrap() | ||
| .insert(new_range, Arc::downgrade(&device)) |
There was a problem hiding this comment.
Was there a bug here before? If another piece of code obtained the lock between the check above and here, and inserted a device that overlaps, this insert call would have replaced the device despite this function returning an error.
Am I understanding correctly?
There was a problem hiding this comment.
yes and no. If this code could execute in parallel then this could happen. But all the bus manipulation happens on the VMM thread, so race could happen.
| ); | ||
|
|
||
| Ok(()) | ||
| devices.insert(new_range, Arc::downgrade(&device)); |
There was a problem hiding this comment.
If we decide to keep assert for this function, we should assert the returned value is None here for extra peace of mind.
| /// | ||
| /// Panics if `len` is zero, if the range overflows, or if the new range overlaps with an | ||
| /// existing device. These conditions indicate a bug in the VMM address layout. | ||
| pub fn insert(&self, device: Arc<dyn BusDeviceSync>, base: u64, len: u64) { |
There was a problem hiding this comment.
I don't agree with using asserts instead of Result here. It's not obvious for the caller to know whether the given range overlaps or not, and it's also impossible to check at runtime, since devices are under a lock that might change between when the caller checks and the actual insertion.
I agree that current calls are static, that might change in the future.
There was a problem hiding this comment.
inputs into this (and remove) functions are obtained from ResourceAllocator and so must always be valid (ResourceAllocator must never return overlapping regions). If we would need to check the range at runtime (like if we implement BAR relocation) we would need to ask ResourceManager anyway to reserve that range and so the check if the range is used or not will be handled there. But if we would really need is_overlapping function in the Bus, we can add it.
There was a problem hiding this comment.
I think there's too much "distance" between ResourceAllocator and Bus, so it's not always obvious that the ranges come from there and have no repeated calls.
If in a specific place such relationship is obvious, callers can use unwrap and panic there.
There was a problem hiding this comment.
since there should be no other source of the ranges for the Bus, other than ResourceAllocator, I don't see a case where it would be realistic to call insert/remove without unwrap(). And if all places need to unwrap(), then it is better to add asserts in the call itself.
Also as you can see, there are other benefits to this approach, like removing Result return types from some other functions.
There was a problem hiding this comment.
Also as you can see, there are other benefits to this approach, like removing Result return types from some other functions
The same could be achieved by calling unwrap/expect on the caller.
| } | ||
|
|
||
| /// Removes the device at the given address space range. | ||
| pub fn remove(&self, base: u64, len: u64) -> Result<(), BusError> { |
There was a problem hiding this comment.
Similarly to insert, a Result here is better because preconditions aren't obvious or possible to validate at runtime
| } | ||
| let end = base.checked_add(len - 1).ok_or(BusError::InvalidRange)?; | ||
| Ok(BusRange { base, end }) | ||
| pub fn new(base: u64, len: u64) -> Self { |
There was a problem hiding this comment.
Since it's a bit controversial, I'll be explciit: I think that for this function using asserts is the right choice, as the constraints on the inputs are obvious, and simple to check from the caller if necessary.
| peer_port: pkt.hdr.src_port(), | ||
| }; | ||
|
|
||
| debug!( |
There was a problem hiding this comment.
Sorry for not commenting on earlier reviews.
These debug logs helped us understand the vsock issue of 1.16, and don't affect production performance since debug logs are disabled in production.
I think we should keep these
There was a problem hiding this comment.
ok, dropped the commit
Remove the local `type Result<T> = result::Result<T, BusError>` alias from vstate/bus.rs. Replace all usages with explicit `Result<T, BusError>` for clarity and consistency with the rest of the codebase. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Replace the runtime assert! in Endpoint::new with a compile-time const assertion at the point where RCV_BUF_MAX_SIZE is defined. This enforces the invariant (RCV_BUF_MAX_SIZE <= MAX_WINDOW_SIZE) at compile time rather than at runtime. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
When a network device configuration is submitted with an existing device ID but different parameters, the previous device is silently destroyed and replaced. The ordering is justified by the fact that the new configuration can use same tap device, so we need to destroy old net device before creating a new one. The issue here is that if the new device fails to create, there is no old device to plug back. Add a warning log so the caller has visibility into this destructive operation. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Replace the Peekable iterator pattern in descriptor chain construction with windows(2) for add_scatter_gather and enumerate with get(i+1) for add_desc_chain. This makes the intent clearer: each iteration processes a descriptor and links it to the next one in the chain. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Bus::insert is only called during VMM setup with known addresses. A failure (zero-sized range, overflow, or overlap) indicates a programming error in the address layout, not a recoverable runtime condition. Replace the Result return type with asserts that panic on invariant violations. Bus::remove is only called with information we previously obtained when inserting the device, meaning the information must be valid. Remove the now-unused BusError::Overlap variant and the Bus(BusError) variants from MmioError, LegacyDeviceError, PciManagerError, AttachDeviceError, DevicePersistError, and DeviceManagerPersistError. PciSegment::new and PciSegment::build also no longer return Result since their only fallible operation was Bus::insert. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Instead of having both HashMap and Vec, use single flat array of optionals. This simplifies the implementation and removes need for any memory allocations. Storage is efficient as well since `Option<Arc<...>>` is a simple pointer and there is a limit of 32 device in general. The only notable difference this brings is that the logic of obtaining the device id and adding the device now is more coupled togather since now there is no option to "reserve" a device_id. This is not an issue since all code always does follow the "obtain id -> add device" scheme. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
RateLimiter `new` was returning Result even though it cannot fail. Remove that and update all relevant code paths. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Instead of manually specifying each byte for BUST_TYPE_ISA use fancy *b"..." syntax. This supposed to be easier to read. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Make `warn!` logs used when guest incorrectly writes the config space of the device more coherent across devices. This also makes them provide more useful information (like what device emitted the log). Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Do a single allocation when setting irq routing instead of pushing to KvmIrqRouting multiple times. Saves us from dynamic re-alloations. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
6c12c47 to
1306c38
Compare
Changes
See commits
Reason
Maintenance
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.PR Checklist
tools/devtool checkbuild --allto verify that the PR passesbuild checks on all supported architectures.
tools/devtool checkstyleto verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.Runbook for Firecracker API changes.
integration tests.
TODO.rust-vmm.