Error status for images and image state management
Instead of "killing" an image when something goes wrong during data upload, we need a good way to make it obvious the image failed to upload and notify the user. I propose we add an image error state and take a deeper look at image states in general.
There are a couple of use cases where this is important.
- Snapshots through nova
When a snapshot fails the image could be set to ERROR instead of being deleted/killed. This way users can see that a snapshot failed.
- Copy-from images
If a copy-from image fails while the async worker is copying the data, the user has no way to know that it failed, and it would be unintuitive for the image resource to simply disappear.
- Failed uploads
In addition to error responses from Glance when an upload fails, it would be nice for that image resource to not be deleted automatically and have a way for the user to know it had errored. This way uploads could be retried without having to create a whole new image entity.
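As a rough illustration of the use cases above, here is a minimal sketch of an image state table with an added error state that permits retrying an upload. The `error` status, the transition table, and the `upload` helper are assumptions made for illustration. They are not existing Glance code; only the queued, saving, active, and killed names mirror statuses Glance already uses.

```python
from enum import Enum


class ImageStatus(Enum):
    QUEUED = "queued"
    SAVING = "saving"
    ACTIVE = "active"
    KILLED = "killed"
    ERROR = "error"  # proposed state, not an existing Glance status


# Hypothetical transition table: a failed upload moves to ERROR instead of
# KILLED, and an ERROR image may return to SAVING when the upload is retried.
ALLOWED = {
    ImageStatus.QUEUED: {ImageStatus.SAVING},
    ImageStatus.SAVING: {ImageStatus.ACTIVE, ImageStatus.ERROR},
    ImageStatus.ERROR: {ImageStatus.SAVING},
    ImageStatus.ACTIVE: set(),
    ImageStatus.KILLED: set(),
}


def transition(current, target):
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def upload(status, write_data):
    """Call write_data() and return the status the image ends up in."""
    status = transition(status, ImageStatus.SAVING)
    try:
        write_data()
    except IOError:
        # Keep the image record around so the user can see the failure.
        return transition(status, ImageStatus.ERROR)
    return transition(status, ImageStatus.ACTIVE)
```

With a table like this, a failed upload leaves the image visible in the error state, and a second call to `upload` on the same image can take it to active without creating a new image entity.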
Blueprint information
- Status:
- Not started
- Approver:
- None
- Priority:
- Undefined
- Drafter:
- Feilong Wang
- Direction:
- Needs approval
- Assignee:
- Feilong Wang
- Definition:
- New
- Series goal:
- None
- Implementation:
- Unknown
- Milestone target:
- None
- Started by
- Completed by
Whiteboard
markwash:
I would like to propose that we address these problems with the addition of a control header in the
HTTP requests for upload/create image requests. I'm not concerned with what the exact header would
be, but if provided it should tell the glance server "don't kill this image if the upload fails".
The question still remains as to how we indicate that the failure occurred.
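A minimal sketch of how a server might read such a control header follows. The header name `x-image-keep-on-failure` and the helper `keep_on_failure` are hypothetical; the proposal deliberately leaves the exact header open, and the sketch does not answer how the failure itself would be reported.

```python
# Hypothetical header name; the proposal does not fix one.
KEEP_HEADER = "x-image-keep-on-failure"


def keep_on_failure(headers):
    """Return True if the request asks the server not to kill the image.

    Header names are matched case-insensitively, as HTTP requires.
    """
    normalized = {
        name.lower(): value.strip().lower()
        for name, value in headers.items()
    }
    return normalized.get(KEEP_HEADER) == "true"
```

A request without the header keeps today's behavior, so existing clients would be unaffected.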
To Mark,
I am curious why the killed status is not displayed in glance. If the upload or snapshot fails, we had better display the image and show its error message, otherwise the end user may keep retrying the operation.
dperaza: Does it make sense to include defect 119115 in the scope of this blueprint? An error could happen in the snapshot path even before nova starts the upload, for example while the image is being streamed from the instance disk. Handling that would require a change that lets the client side call update to change the state and include the error. I also propose that we include the error text as a reserved image property, so that a client inspecting the image's progress can see why the image is now in error.
dperaza:
By the way, killed images do not show up when you list images, but they do show up if you query the details directly by ID. So if you save the URL before the error occurs, you can always check the status of that specific image, even if it goes to killed. Even if we decide not to add a new state, we should at least add an error property explaining why.
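The behavior described here can be sketched with an in-memory stand-in for the image registry. This is not the Glance API: the `IMAGES` data, the helper names, and the `error_message` property name are all assumptions used only to show the idea.

```python
# In-memory stand-in for stored images; ids and values are made up.
IMAGES = {
    "a1": {"id": "a1", "status": "active", "properties": {}},
    "b2": {
        "id": "b2",
        "status": "killed",
        # Hypothetical reserved property carrying the reason for the failure.
        "properties": {"error_message": "upload interrupted"},
    },
}


def list_images(images):
    """Return images for a listing, leaving out killed ones."""
    return [img for img in images.values() if img["status"] != "killed"]


def get_image(images, image_id):
    """Return a single image by id, whatever its status."""
    return images[image_id]
```

Listing returns only image `a1`, while `get_image(IMAGES, "b2")` still returns the killed image along with the property that records why it failed.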
flwang:
@dperaza, I'm working on this bp, which is based on discussions between ameade and me that began while I was working on an internal bug (146349). After discussing it with ameade and markwash several times, we prefer to add a "task" to track the image create action, and I think the "task" may be used to track other actions as well. You can refer to this link about the "task" proposal: https:/
dperaza:
@flwang, after reading your blueprint I still don't think it will handle bug 1191115. Both bp image-error-
To summarize, the use case here is tracking snapshots and snapshot failures. This is a task for nova (pun intended!). We do not want to track "error" states on images. An image should not be in an error state--it should either exist and be active, or not exist. Rather, any errors should be tracked as part of a resource that models the act of creating the instance, such as import.
markwash rejected 2014-02-15