To store files larger than 16 MB (the maximum size of a single BSON document), and also any file you want to access without loading it entirely into memory, MongoDB provides GridFS, which Spring exposes to your code as GridFSDBFile objects. Many of my files were larger than 16 MB, so I went with GridFS.
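GridFS gets around the size limit by splitting each file into chunks that are stored as separate documents, which is also why a file can be read piece by piece. Here is a minimal sketch of the arithmetic, assuming a default chunk size of 255 KB (check your driver's configuration, since the value can be changed). The class and method names are hypothetical and exist only for illustration.

```java
class GridFsChunkMath {

    // Assumed default GridFS chunk size (255 KB); configurable in the driver.
    static final long DEFAULT_CHUNK_SIZE = 255L * 1024;

    /** Number of chunk documents needed to hold a file of the given length. */
    static long chunkCount(long fileLength, long chunkSize) {
        if (fileLength < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and chunkSize > 0");
        }
        // Ceiling division: a partially filled last chunk still counts as a chunk.
        return (fileLength + chunkSize - 1) / chunkSize;
    }

    /** Zero-based index of the chunk that contains the given byte offset. */
    static long chunkIndex(long offset, long chunkSize) {
        if (offset < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and chunkSize > 0");
        }
        return offset / chunkSize;
    }

    public static void main(String[] args) {
        long twentyMegabytes = 20L * 1024 * 1024;
        // A 20 MB PDF does not fit in one document but does fit in 81 chunks.
        System.out.println(chunkCount(twentyMegabytes, DEFAULT_CHUNK_SIZE)); // prints 81
    }
}
```

Reading a byte range only touches the chunks between `chunkIndex(start, ...)` and `chunkIndex(end, ...)`, so the rest of the file never has to be held in memory.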
- Read - I used an id as the filename:

  ```java
  public GridFSDBFile findPDF(String id) {
      log.debug("Reading PDF filename: " + id);
      // Look the file up by the filename it was stored under.
      Query query = new Query(Criteria.where("filename").is(id));
      return gridOperations.findOne(query);
  }
  ```

- Write - Before storing, I check whether the file is already in the DB, because I don't want to replace it or download it again. Be aware that the two ways of checking differ in cost: the exists query took an average of 0.002 ms, versus around 0.007 ms for findById (over three times as long per file downloaded):

  ```java
  ...
  Boolean downloaded = this.existsPDF(id);
  if (!downloaded) {
      // Not in the DB yet: download and store it under the id.
      GridFSFile file = gridOperations.store(new ByteArrayInputStream(pdf), id);
  }
  ...

  public Boolean existsPDF(String id) {
      log.debug("Checking whether PDF exists, filename: " + id);
      Query query = new Query(Criteria.where("filename").is(id));
      // fs.files is the collection where GridFS keeps the file metadata.
      return mongoTemplate.exists(query, "fs.files");
  }
  ```

- Download - I download the PDF files from the web using RestTemplate. I found that RestTemplate's exchange method was not following redirects, so I followed them myself with a while loop. The complete code:

  ```java
  RestTemplateTimeout restTemplate = new RestTemplateTimeout();
  restTemplate.getMessageConverters().add(new ByteArrayHttpMessageConverter());

  HttpHeaders headers = new HttpHeaders();
  headers.setAccept(Arrays.asList(MediaType.APPLICATION_OCTET_STREAM));
  // Assumed: the request entity carries the headers (its declaration was not shown).
  HttpEntity<String> entity = new HttpEntity<String>(headers);

  ResponseEntity<byte[]> response = restTemplate.exchange(
          url, HttpMethod.GET, entity, byte[].class);
  log.debug("Content type for the url: " + response.getHeaders().getContentType()
          + " location: " + response.getHeaders().getLocation());

  // Follow 301/302/303 responses manually by requesting the Location header.
  while (response.getStatusCode() == HttpStatus.SEE_OTHER
          || response.getStatusCode() == HttpStatus.FOUND
          || response.getStatusCode() == HttpStatus.MOVED_PERMANENTLY) {
      log.debug("Content type for the url: " + response.getHeaders().getContentType()
              + " location: " + response.getHeaders().getLocation());
      response = restTemplate.exchange(
              response.getHeaders().getLocation(), HttpMethod.GET, entity, byte[].class);
  }

  // Only store responses that look like a PDF or a generic binary download.
  MediaType contentType = response.getHeaders().getContentType();
  if (contentType != null
          && (contentType.toString().contains(MediaType.APPLICATION_PDF_VALUE)
              || contentType.toString().contains(MediaType.APPLICATION_OCTET_STREAM_VALUE))) {
      log.debug("Downloading file");
      if (response.getStatusCode() == HttpStatus.OK) {
          byte[] pdf = response.getBody();
          GridFSFile file = gridOperations.store(new ByteArrayInputStream(pdf), id);
          if (file != null) {
              String idPdf = file.getId().toString();
          }
      }
  }
  ```
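One thing to watch with the while loop in the download code: it has no upper bound, so two URLs that redirect to each other would keep it running forever. Below is a minimal sketch of the same manual redirect-following with a hop limit. To keep it runnable without Spring it is written against a hypothetical `Fetcher` interface instead of RestTemplate; `RedirectFollower`, `Response`, `Fetcher`, and the limit of 5 are all assumptions made for illustration, not part of my actual code.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class RedirectFollower {

    /** Minimal stand-in for an HTTP response: status code plus optional Location header. */
    static final class Response {
        final int status;
        final String location; // null when the response has no Location header

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    /** Hypothetical abstraction over the HTTP client (RestTemplate in the post). */
    interface Fetcher {
        Response get(URI uri);
    }

    /** Same three status codes the post checks: 301, 302 and 303. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303;
    }

    /** Follows redirects until a non-redirect response arrives or maxHops is exceeded. */
    static Response follow(URI start, Fetcher fetcher, int maxHops) {
        URI current = start;
        Response response = fetcher.get(current);
        int hops = 0;
        while (isRedirect(response.status)) {
            if (response.location == null) {
                throw new IllegalStateException("Redirect without Location header from " + current);
            }
            if (++hops > maxHops) {
                throw new IllegalStateException("More than " + maxHops + " redirects from " + start);
            }
            // resolve() also handles relative Location values such as "/b.pdf".
            current = current.resolve(response.location);
            response = fetcher.get(current);
        }
        return response;
    }

    public static void main(String[] args) {
        // Fake server: a.pdf -> b.pdf -> c.pdf, which finally answers 200.
        Map<String, Response> fake = new HashMap<>();
        fake.put("http://example.com/a.pdf", new Response(301, "/b.pdf"));
        fake.put("http://example.com/b.pdf", new Response(302, "https://example.com/c.pdf"));
        fake.put("https://example.com/c.pdf", new Response(200, null));

        Fetcher fetcher = uri -> fake.get(uri.toString());
        Response result = follow(URI.create("http://example.com/a.pdf"), fetcher, 5);
        System.out.println(result.status); // prints 200
    }
}
```

In the RestTemplate version the same idea amounts to counting iterations of the while loop and giving up once the counter passes the limit.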