File uploads + REST = how to do it properly

 


File uploads and REST API: why so complicated?

Almost every web application needs file uploads, whether they are Word documents, user pictures or whatever else the application needs. Sadly, when an API is built, the file upload is often a special API call that works differently from all the other API calls. So what makes it so complicated?

Sending files over JSON

You could send files over JSON of course. We only need to submit the original file name and the file contents, right?

{
    "filename": "file.txt",
    "contents": "Lorem ipsum"
}
In vanilla JavaScript you can build this JSON very easily from a file input with the FileReader API:

document.getElementById('file_upload').addEventListener('change', function(event) {
    const file = event.target.files[0];
    if (file) {
        const reader = new FileReader();
        reader.onload = function(e) {
            const contents = e.target.result;
            const jsonObject = {
                 filename: file.name,
                 contents: contents
            };
            const jsonString = JSON.stringify(jsonObject);
            console.log(jsonString);
        };
        reader.readAsText(file);
    } else {
        console.error("No file selected.");
    }
});
While this looks simple, there are a few problems with uploading files like this. This solution works great with plaintext files, but it will be atrocious when uploading any non-text file. JSON is meant to be readable and writable by a human being, but image files are not. FileReader.readAsText assumes by default that the file you upload is plaintext in UTF-8 encoding. A binary file almost always contains byte sequences that are not valid UTF-8, and those bytes are replaced with a replacement character, so the server receives a corrupted file. It would be safer to use FileReader.readAsDataURL to send the file base64 encoded instead. The problem with this is that base64 needs 4 characters for every 3 bytes of the file, so uploading a 10MB file means you have to upload more than 13MB of JSON. Also I have not seen any application streaming large JSON in an API call, so the server also needs to load the entire file in memory.
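To make both problems concrete, here is a small sketch that runs in Node.js or in a browser console. The sample bytes and the helper name `base64Length` are made up for illustration; `TextDecoder`, `TextEncoder` and `btoa` are standard APIs, and decoding with `TextDecoder` only approximates what `FileReader.readAsText` does.

```javascript
// The first four bytes of a typical JPEG file. 0xFF is never valid in UTF-8.
const original = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);

// Roughly what FileReader.readAsText does: decode the bytes as UTF-8.
// Invalid sequences are silently replaced with U+FFFD.
const asText = new TextDecoder('utf-8').decode(original);

// Encoding the text back to bytes does not give us the original file.
const roundTripped = new TextEncoder().encode(asText);
const isCorrupted = roundTripped.length !== original.length
    || roundTripped.some((byte, index) => byte !== original[index]);

// Base64 turns every 3 bytes into 4 characters (padded at the end).
function base64Length(byteLength) {
    return 4 * Math.ceil(byteLength / 3);
}

// A 10 MiB file grows by roughly a third, before any JSON quoting is added.
const tenMebibytes = 10 * 1024 * 1024;
const encodedSize = base64Length(tenMebibytes);
const overheadRatio = encodedSize / tenMebibytes;

console.log({ isCorrupted, encodedSize, overheadRatio });
```

The round trip through text is lossy, while base64 is lossless but larger, which is the trade-off described above.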

Multipart encoding

So let's see how API Platform handles this. It has a very long page about it, but in reality you are just writing a regular form submit file upload action in Symfony with some metadata for OpenAPI, JSON-LD etc. That's a lot of effort for a file upload.

Let's make it simple by going back to the basics of a file upload. So let's look at how a file upload works with a traditional <form> submit:
<form method="post">
    <input type="file" name="file">
    <input type="submit">
</form>
If we do this, we see that the file itself is not submitted, because the form needs a special enctype attribute to actually include the file in the request:

POST /api/example/ HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br, zstd
Accept-Language: nl-NL,nl;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 46
Content-Type: application/x-www-form-urlencoded
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36

file=_0efa06bd-0f38-4f98-9db3-d19d496231dd.jpg
The Content-Type header tells the server that the data is sent in the format 'application/x-www-form-urlencoded', which is the same format that PHP's http_build_query() produces. As you can see, it only submits the file name, but no file data. To make it submit the file we add the enctype attribute to the form:
<form method="post" enctype="multipart/form-data">
    <input type="file" name="file">
    <input type="text" name="description">
    <input type="submit">
</form>
If we submit this form we do upload the image in the HTTP request:
POST /api/example HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br, zstd
Accept-Language: nl-NL,nl;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 127175
Content-Type: multipart/form-data; boundary=12345
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36

--12345
Content-Disposition: form-data; name="description"
Content-Type: text/plain

my description

--12345
Content-Disposition: form-data; name="file"; filename="image.jpg"
Content-Type: image/jpeg

<human unreadable image data>
--12345--
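The same kind of request can be built from script with the standard FormData API. This is a minimal sketch: the URL `https://example.test/api/example` and the four sample bytes are placeholders, and in a real page the Blob would be the File taken from the file input.

```javascript
// Hypothetical stand-in for a file picked in <input type="file">.
const image = new Blob([new Uint8Array([0xff, 0xd8, 0xff, 0xe0])], {
    type: 'image/jpeg',
});

const form = new FormData();
form.append('description', 'my description');
// The third argument becomes the filename in the Content-Disposition header.
form.append('file', image, 'image.jpg');

// Do not set Content-Type yourself: the runtime generates the boundary
// and fills in the header when it serializes the FormData body.
const request = new Request('https://example.test/api/example', {
    method: 'POST',
    body: form,
});

const contentType = request.headers.get('content-type');
console.log(contentType);
```

Passing `request` to `fetch()` would send it. Leaving the Content-Type header alone matters, because a hand-written header would lack the boundary the body actually uses.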
Looking at how this HTTP call works, we might assume we can make an API call that sends both JSON and a file upload:
POST /api/example HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br, zstd
Accept-Language: nl-NL,nl;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 127175
Content-Type: multipart/form-data; boundary=12345
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36

--12345
Content-Type: application/json

{"description":"my description"}

--12345
Content-Disposition: form-data; name="file"; filename="image.jpg"
Content-Type: image/jpeg

<human unreadable image data>
--12345--
Great! Case closed! Except that it does not work like that! Symfony and Laravel both build their request object from the PHP superglobals. The only way to parse this request would be to write our own request parser on top of the special php://input stream wrapper. But according to the documentation, php://input is not available in POST requests with enctype="multipart/form-data" if the enable_post_data_reading option is enabled.

OpenAPI spec

Another reason not to follow this path is that we cannot specify this request in OpenAPI either. We can, however, specify a multipart/form-data request by putting all our form fields in a single form field called "form", like this:
POST /api/example HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate, br, zstd
Accept-Language: nl-NL,nl;q=0.9,en-US;q=0.8,en;q=0.7
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 127175
Content-Type: multipart/form-data; boundary=12345
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36

--12345
Content-Disposition: form-data; name="form"
Content-Type: application/json

{"description":"my description"}

--12345
Content-Disposition: form-data; name="file"; filename="image.jpg"
Content-Type: image/jpeg

<human unreadable image data>
--12345--
This request works as intended, except that we receive all our form fields inside "form", so instead of 'description' we need to send form[description]. This is however already the case with apie/cms, so that was not bad. The only issue I found is that the Symfony/Laravel request object discards the content type "application/json" of the JSON part, so the JSON in "form" comes in as plaintext and has to be decoded manually. I have not found a way to read this content type header. It does work perfectly in an OpenAPI spec, and Swagger UI, which renders the test page for OpenAPI, also sends it properly.
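As an illustration, the "form" plus "file" layout could be described in OpenAPI 3.0 roughly like this. The sketch is written as a JavaScript object so it matches the other examples (the same structure can be written as YAML or JSON), and it is not the spec that Apie generates. The `decodeFormField` helper is hypothetical and only mirrors in JavaScript the manual decoding step that the PHP side needs.

```javascript
// Request body for an OpenAPI 3.0 operation: one JSON part and one file part.
const requestBody = {
    required: true,
    content: {
        'multipart/form-data': {
            schema: {
                type: 'object',
                required: ['form', 'file'],
                properties: {
                    form: {
                        type: 'object',
                        required: ['description'],
                        properties: { description: { type: 'string' } },
                    },
                    // In OpenAPI 3.0 a file upload is a binary string.
                    file: { type: 'string', format: 'binary' },
                },
            },
            // Declares that the "form" part is sent as application/json.
            encoding: { form: { contentType: 'application/json' } },
        },
    },
};

// The receiving side gets "form" as a plain string, so it has to be
// decoded and checked by hand (hypothetical helper).
function decodeFormField(rawValue) {
    const decoded = JSON.parse(rawValue);
    if (decoded === null || typeof decoded !== 'object' || Array.isArray(decoded)) {
        throw new TypeError('The "form" field must contain a JSON object.');
    }
    return decoded;
}

console.log(decodeFormField('{"description":"my description"}').description);
```

Checking that the decoded value is an object is worth the extra lines, because a client can put any valid JSON (a number, a string, an array) in that field.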

CSRF Protection

Since our REST API call with a file upload can be called with a <form> tag, we are vulnerable to CSRF if we are also using session cookies. So what is CSRF? CSRF (cross-site request forgery) is an attack where a malicious page secretly submits a form to a different website. Since you are logged in on the other website with a session cookie, the attacker can do some very dangerous stuff without you knowing it. For example, if a bank has a CSRF vulnerability and I know you are logged in, I could make a page like this to send me money from your bank account without you noticing the transfer.

<div style="display:none">
    <iframe name="iframe"></iframe>
    <form id="example" action="https://examplebank.com/form-submit/send/money" method="post" target="iframe">
        <input type="hidden" name="to" value="NL99000BANK000123456">
        <input type="hidden" name="amount" value="999EUR">
        <input type="submit">
    </form>
    <script>
        document.getElementById('example').submit();
    </script>
</div>

We could add an Ajax action that hands out a CSRF token (which is considered a security risk) and add the token to the file upload call, but there is a better solution. A SameSite session cookie could help, but older browsers do not support it and are still vulnerable. The simplest solution that also works is requiring a custom X-NO-CSRF header, which can only be sent with an Ajax call. A form submit can never send this header, so a request that carries it must come from an Ajax call, and cross-origin Ajax calls with custom headers are already restricted by CORS. If the header is missing you still need to provide a CSRF token.
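The rule from the paragraph above can be sketched as follows. The function name `passesCsrfCheck`, its parameters and the URL are hypothetical, and the real check in the article runs in PHP; this only shows the decision logic using the standard Fetch API `Headers` and `Request` classes.

```javascript
// Hypothetical server-side check. `headers` is a Fetch API Headers object,
// `submittedToken` and `sessionToken` are the classic CSRF token pair.
function passesCsrfCheck(headers, submittedToken, sessionToken) {
    // A plain <form> submit cannot add custom headers, so the presence of
    // X-NO-CSRF shows that the request was made from script.
    if (headers.has('X-NO-CSRF')) {
        return true;
    }
    // No header: fall back to the regular CSRF token comparison.
    // (A production check should compare tokens in constant time.)
    return typeof sessionToken === 'string'
        && sessionToken.length > 0
        && submittedToken === sessionToken;
}

// Client side: add the header to the upload call.
const upload = new Request('https://example.test/api/example', {
    method: 'POST',
    headers: { 'X-NO-CSRF': '1' },
    body: new FormData(),
});

console.log(passesCsrfCheck(upload.headers, undefined, 'session-token'));
```

The header value itself does not matter here. Because the header is not on the CORS safelist, a cross-origin script that tries to send it triggers a preflight, which the server can refuse.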

File uploads in Apie

File uploads in Apie are not as hard as they are in, for example, API Platform. All I do is give a resource a Psr\Http\Message\UploadedFileInterface typehint and Apie does all the magic mentioned above. The multipart variant requires an AllowMultipart attribute on the class, otherwise file uploads can only be sent as JSON.


use Apie\Core\Attributes\AllowMultipart;
use Apie\Core\Entities\EntityInterface;
use Psr\Http\Message\UploadedFileInterface;

#[AllowMultipart]
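class ExampleFileUpload implements EntityInterface
{
    // Declared explicitly: creating an undeclared property is deprecated since PHP 8.2.
    private ExampleFileUploadIdentifier $id;

    public function __construct(
        public UploadedFileInterface $file,
        public NonEmptyString $description
    ) {
        $this->id = ExampleFileUploadIdentifier::createRandom();
    }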
    
    public function getId(): ExampleFileUploadIdentifier
    {
        return $this->id;
    }
}


Internally Apie uses a class that implements UploadedFileInterface called Apie\Core\FileStorage\StoredFile, which also offers some methods I need internally for Apie. For example, I need a storage path to store the file with a storage provider (S3, local file, inline in database), but I also need an index, so I can search file contents. I'll cover indexing files in a future article.

Conclusion

File uploads are complex because they work differently from regular API calls, but with a little bit of investigation they can be implemented in an API call very easily.


