A young computer scientist and two colleagues show that searches within data structures called hash tables can be much faster than previously deemed possible.
-
-
The paper was published by IEEE and with professors as co-authors. Only the second author is a student. And I wouldn't dismiss it out of hand like that because of a magazine article. Students come up with breakthroughs all the time.
The paper itself says it disproves Yao's conjecture. I personally plan to implement and benchmark this because the results seem so good. It could be another fibonacci heap situation, but maybe not. Hash tables are so widely used, that it might even be worthwhile to make special hardware to use this on servers, if our current computer architecture is only thing that holds back the performance.Edit: author sequence
-
And yet all that pales in comparison to using react (or whatever framework) over vanilla js. Enter McMaster-Carr.
-
yupyup, just send HTML over the wire. it's fine.
-
-
Depends on the implementation, but most will, yes. There are other forms of associative arrays, like trie or binary tree, but hash is the most common.
-
-
...before the program even exists...?
-
So... databases? Especially in data centers? Still a nice boost in that case
-
Sorry to be blunt, but you don't know what you're talking about.
-
Everyone prepare for your minds to be blown:
-
Hash tables are used in literally everything and they always need to minimize resizing because it’s a very expensive operation.
I suspect this will silently trickle into lots of things once it gets picked up by standard Python and JavaScript platforms, but that will take years.
-
If you use a hash table, you search every time you retrieve an object.
If you didn’t retrieve, why would you be storing the data in the first place?
-
In the article, x is not the size of the hash table, it is the inverse of the table’s filling fraction. A 1000-element table that is 90% full has x=10, N=1000.
Since they’re not discussing scaling of data sizes, would be confusing to use O(N) notation or people would make that assumption.
-
They managed to get the time to O(1)
-
The article is discussing how to reduce the constant time factor which depends on the filling fraction, which is a speed-memory tradeoff when creating the hash table.
The innovation described allows for the use of fuller tables which are resized less frequently, or faster insertion/retrieval for the existing filling fraction.
-
I know that, but phrased that way it sort of sounds like they're iterating over the entire thing.
-
How much do you think this speeds up an average program? Enlighten us with your knowledge.
-
I wrote my comment not to antagonize you but to point out that you're asking the wrong questions. I failed to articulate that, and I'm sorry for being harsh.
Your prior comment indicated that you have used hash tables in Java, which were very fast. You said that your program accessed the hash tables, but did not "search" the table. These operations are the same thing, which led me to believe you're out of your depth.
This last comment asks me how much this paper's contribution speeds up an average program. You're asking the wrong question, and you seem to be implying the work was useless if it doesn't have an immediate practical impact. This is a theoretical breakthrough far over my head. I scanned the paper, but I'm unsurprised they haven't quantified the real-world impact yet. It's entirely possible that despite finding an asymptotic improvement, the content factors elided by the big O analysis are so large as to be impractical... or maybe not! I think we need to stay tuned.
Again, sorry for being blunt. We all have to start somewhere. My advice is to be mindful of where the edge of your expertise lies and try to err on the side of not devaluing others' work.
-
the reason it confused me is because the college student was clearly using the algorithm to accomplish his task, not just theoretically designed. So it didn't seem to be a small improvement that would only be noticeable in certain situations.
I'm not smart enough to understand the papers so that's why I asked.