0 votes
asked in MicroStream for Java by (120 points)
Suppose I have an existing HashMap<Integer,Integer> map, and I call map.put(1,3). Does the whole HashMap have to be stored? And will it therefore copy and restore the entire HashMap?

Or, put another way, suppose I have a graph:

class Graph {
    HashMap<Integer,Integer> adjacencyMap;
}



An insert into the graph will update the adjacencyMap. Does the entire map need to be stored?

1 Answer

+1 vote
answered by (3.4k points)

You are right: after adding elements to the HashMap, you need to store the entire HashMap. The persisted HashMap will also be restored on the next MicroStream startup.

Storing a previously stored map again after it has been modified will write the map itself and all new objects in the map. Objects that are already stored will not be persisted again.

If you put a lot of elements into the map you are not forced to store the map after every single map.put(..) operation.
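
As a rough illustration of batching, the sketch below inserts many entries and then triggers a single store. The `storeCall` parameter is a hypothetical hook standing in for whatever store operation the application uses; it is not a MicroStream API, and the class and method names are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Sketch only: storeCall is a hypothetical stand-in for the application's
// store operation. Nothing here is MicroStream API.
final class BatchedPuts {

    /** Applies every put first, then triggers exactly one store for the whole batch. */
    static <K, V> void putAllThenStore(Map<K, V> target, Map<K, V> batch,
                                       Consumer<Map<K, V>> storeCall) {
        target.putAll(batch);      // many in-memory puts, no storing yet
        storeCall.accept(target);  // one store call covers all of them
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        Map<Integer, Integer> batch = new HashMap<>();
        for (int i = 0; i < 1_000; i++) {
            batch.put(i, i * 2);
        }
        int[] storeCalls = {0};    // counts how often the hook was invoked
        putAllThenStore(map, batch, m -> storeCalls[0]++);
        System.out.println(map.size() + " entries, " + storeCalls[0] + " store call");
    }
}
```

The trade-off is that entries added since the last store exist only in memory until the store call runs.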


Best regards
commented by (120 points)
Based on what you are saying, the HashMap needs to be re-created after each store operation. So that means, for example, that representing a Graph using an adjacency list (HashMap<V,List<V>>) would be too inefficient.

Do you have any suggestions?
commented by (3.4k points)

The performance impact is not as big as you suspect. MicroStream does not alter the HashMap instance in any way when it stores the HashMap. Therefore, there is no need to recreate it after storing if you keep a reference to the map in your program.

MicroStream will only recreate the HashMap during the MicroStream startup process (if there is already a persisted one). Starting MicroStream is usually done once in the application's lifetime.


Best regards
commented by (120 points)

Suppose I do a new put on the HashMap which inserts a new entry into the map. If I store the map after that, is the entire map copied to the store? So, for example, if the map contains 10 million key-values, will the store have to recreate a map of 10 million key-values?


commented by (3.4k points)

By default, MicroStream uses a lazy storing strategy, which means that referenced objects that have been stored previously will not get stored again.

In your example this would result in rewriting 10 million IDs for the keys and 10 million IDs for the values. The already persisted key and value objects will not be persisted again.

The new data also requires one ID for the new key and one for the new value. Additionally, the new key and value objects are stored too.

In total: 20 million + 2 IDs (which are long values), plus the new key and value objects.
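
To put a number on that total, here is a small back-of-the-envelope calculation. It assumes 8 bytes per ID (because the IDs are long values) and counts only the key and value references. Any per-record overhead of the storage format and the size of the new key and value objects are deliberately left out, so treat the result as a lower bound rather than an exact figure.

```java
// Sketch: estimates only the bytes taken by the key/value reference IDs.
// Assumption: each ID is one Java long (8 bytes); storage overhead is ignored.
final class RewriteCost {

    /** Bytes needed for the key and value IDs of a map with the given number of entries. */
    static long referenceBytes(long entries) {
        long idsPerEntry = 2; // one ID for the key, one for the value
        return entries * idsPerEntry * Long.BYTES;
    }

    public static void main(String[] args) {
        long entries = 10_000_000L + 1; // existing entries plus the newly added one
        System.out.println(referenceBytes(entries) + " bytes"); // prints "160000016 bytes"
    }
}
```

That is roughly 160 MB of references rewritten for a single new entry.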
commented by (120 points)
That is very inefficient for just one or even a handful of updates!
commented by (3.4k points)
Unfortunately, this is true when adding only a few elements to large collections and then storing those collections.

General advice would be to avoid very large collections by splitting them into several smaller collections, but this may not be possible in all cases. One example would be to split a large list of sales into one list per month.

If you could provide more details on your current task, e.g. code snippets, I would try to give you more detailed support.
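
The sales example could look roughly like the sketch below. The class, field, and method names are hypothetical and only illustrate the data layout; no MicroStream call is shown. The idea is that adding a sale changes only one month's list, so that small list is the collection that would need to be stored again, not one huge list of all sales.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; none of this is MicroStream API.
final class SalesByMonth {

    static final class Sale {
        final LocalDate date;
        final long amountCents;

        Sale(LocalDate date, long amountCents) {
            this.date = date;
            this.amountCents = amountCents;
        }
    }

    // One small list per month instead of a single large list of all sales.
    private final Map<YearMonth, List<Sale>> months = new HashMap<>();

    /** Adds the sale and returns the one list that changed. */
    List<Sale> add(Sale sale) {
        List<Sale> bucket =
            months.computeIfAbsent(YearMonth.from(sale.date), k -> new ArrayList<>());
        bucket.add(sale);
        return bucket;
    }

    int monthCount() {
        return months.size();
    }

    public static void main(String[] args) {
        SalesByMonth sales = new SalesByMonth();
        List<Sale> changed = sales.add(new Sale(LocalDate.of(2023, 1, 5), 1_000));
        System.out.println("changed list size: " + changed.size());
    }
}
```

Note that when a sale opens a new month, the outer map changes as well and would also have to be stored, but it holds only one entry per month.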


Best Regards