Indexing HTML Attributes and Unique Indexes
With Custom Elements you have attributeChangedCallback, which reacts to changes in the attributes returned by observedAttributes, and I believe you'd like to use that to create getters and setters out of the box.
I don't think DOM specific class fields/syntax will ever land in JS itself, but I can suggest looking at the handiest Custom Elements patterns here: gist.github.com/WebReflection/ec9f6687842aa385477c4afca625bbf4
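As a rough illustration of generating getters and setters from a list of attribute names, here is a minimal sketch. The helper name reflectAttributes and the camelCase mapping are assumptions made for this example; they are not taken from the gist or from any standard API.

```javascript
// Hypothetical helper (not a standard API): for each attribute name, define
// a camelCase accessor pair on the class prototype that forwards to
// getAttribute / setAttribute / removeAttribute.
function reflectAttributes(ElementClass, attributeNames) {
  for (const attr of attributeNames) {
    // "my-attribute" -> "myAttribute"
    const prop = attr.replace(/-([a-z])/g, (_, c) => c.toUpperCase());
    Object.defineProperty(ElementClass.prototype, prop, {
      configurable: true,
      enumerable: true,
      get() {
        return this.getAttribute(attr);
      },
      set(value) {
        // null / undefined removes the attribute, anything else is stringified
        if (value == null) this.removeAttribute(attr);
        else this.setAttribute(attr, String(value));
      },
    });
  }
  return ElementClass;
}

// Browser usage (element name and attributes are assumed for illustration):
// class CustomPerson extends HTMLElement {
//   static get observedAttributes() { return ['my-attribute', 'age']; }
// }
// reflectAttributes(CustomPerson, CustomPerson.observedAttributes);
// customElements.define('custom-person', CustomPerson);
```

Attribute names that map onto existing HTMLElement properties (id, title and so on) would be shadowed by this helper, so it is probably best kept to custom attribute names.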
About being unique, you can always call document.querySelector('[attribute="' + value + '"]') and, if the result is not null, throw an error because the value is already live in the DOM.
However, IDs are the most "unique" thing you can have, even if two IDs with the same value are still allowed to live in the document.
If you're looking for an easy way to have unique IDs, remember you can start from let id = Math.random() and do ++id any other time to have a new, rarely clashing, unique name. Prefix it with the nodeName and you'll see you already have all the uniqueness you need for your custom elements, since you can't define two custom elements with the same name anyway (yet, unless scoped, but that's another story).
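The idea above can be sketched in a few lines. The function name nextUid and the "name-number" format are assumptions for illustration only.

```javascript
// Random starting point, incremented once per requested id.
let counter = Math.random();

// Hypothetical helper: nodeName is the element's tag name,
// e.g. "CUSTOM-PERSON" as reported by element.nodeName.
function nextUid(nodeName) {
  return `${nodeName.toLowerCase()}-${++counter}`;
}

// Browser usage inside a custom element (assumed):
// connectedCallback() { if (!this.id) this.id = nextUid(this.nodeName); }
```

The generated ids contain a dot, which is fine for getElementById but would need escaping (for example with CSS.escape) before being used inside a selector.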
live *
Thanks for the link. My current approach is similar to what you and the article describe. Maybe it’s just the old DBA in me, but even when I narrow my parameters (node.querySelector("[…]")) it feels like I’m doing a lot of “full table scans” when I would want to index some of the “columns”. I’m sure the engines are pretty optimized for this though.
About being unique, you can always document.querySelector('[attribute="' + value + '"]')
This code is vulnerable to CSS injection; input values shouldn't be inserted raw into queries! You can use CSS.escape to sanitize.
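A minimal sketch of the safer query might look like the following. The helper name attributeSelector is an assumption; the escape function is passed in as a parameter (defaulting to the browser's CSS.escape) only so the sketch can also be exercised outside a browser.

```javascript
// Hypothetical helper: build an attribute selector with an escaped value.
// `name` is assumed to be a trusted, developer-chosen attribute name.
function attributeSelector(name, value, escape = globalThis.CSS?.escape) {
  if (typeof escape !== 'function') {
    throw new Error('CSS.escape is not available in this environment');
  }
  return `[${name}="${escape(String(value))}"]`;
}

// Browser usage (assumed attribute name):
// const existing = document.querySelector(attributeSelector('data-uid', value));
// if (existing !== null) throw new Error('value is already live in the DOM');
```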
Thanks for the link. My current approach is similar to what you and the article describe. Maybe it’s just the old DBA in me, but even when I narrow my parameters (node.querySelector("[…]")) it feels like I’m doing a lot of “full table scans” when I would want to index some of the “columns”. I’m sure the engines are pretty optimized for this though.
What do you mean by "Unique Indexes" (specifically, unique within the scope of an HTML document; the indexes of elements are already unique) and "full table scans" (relevant to CSS specificity; that is, what code are you using now that is not capable of selecting specific elements and attribute values)? CSS selectors can select any element by a variety of attribute name and value combinators, including using data-* attributes and Microdata. It is the responsibility of the developer to create unique names and values for HTML elements, and to not create duplicate ids. Is the HTML being used dynamic or static? Whether the HTML is dynamic or static, Map and WeakMap can be used for "unique" key-value pairs of HTML elements, and HTML element attributes and values.
Full Table Scans and Unique indexes are database concepts (that's the DBA reference). When a database searches for a record based on a column value, it might look at every record in the table to find the matches - scan the entire (full) table, in the order the records were inserted or stored. To speed this up, we can create indexes on table columns or column groups. These are like ordered maps or hash tables. To find a record, more efficient searches can be done against the indexes to find the records. Indexes can also act as constraints. A "Unique Index" is a constraint that checks a table to see if a value exists before inserting it in the table and adding it to the index. Indexing has a trade-off. It slows inserting, but improves searching. While understanding that databases and browsers are worlds apart, a foundational part of database engines is searching, just like it is in DOM manipulation. Indexing can provide orders of magnitude performance improvements when reading/searching in databases. It seemed worth seeing if the concept translated across technologies.
Without any optimizations, an attribute search on a document would look at each node, and then at each attribute of the node to find a match - Full Table Scan. This makes searches very slow. At an absurd extreme, we could index everything, making startup very slow and eating memory, but making some searches very fast. The balanced approach is to implement "indexing" ourselves (using any of the mentioned approaches) to get the best level.
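To make the "index it ourselves" idea concrete, here is a minimal sketch of a hand-rolled attribute index with an optional unique constraint. The class name AttributeIndex and its methods are assumptions for illustration; nothing like this is provided by the DOM.

```javascript
// Hypothetical index over one attribute: attribute value -> Set of elements.
// Elements only need a getAttribute(name) method.
class AttributeIndex {
  constructor(attributeName, { unique = false } = {}) {
    this.attributeName = attributeName;
    this.unique = unique;
    this.byValue = new Map();
  }

  // Register an element. With unique: true, a second element carrying
  // the same value throws, like a unique index rejecting an insert.
  add(element) {
    const value = element.getAttribute(this.attributeName);
    if (value === null) return;
    let bucket = this.byValue.get(value);
    if (!bucket) this.byValue.set(value, (bucket = new Set()));
    if (this.unique && bucket.size > 0 && !bucket.has(element)) {
      throw new Error(`Duplicate ${this.attributeName}="${value}"`);
    }
    bucket.add(element);
  }

  // Call before the attribute value changes or the element is discarded.
  remove(element) {
    const value = element.getAttribute(this.attributeName);
    const bucket = this.byValue.get(value);
    if (!bucket) return;
    bucket.delete(element);
    if (bucket.size === 0) this.byValue.delete(value);
  }

  // Lookup without scanning the document.
  find(value) {
    return [...(this.byValue.get(value) ?? [])];
  }
}
```

This shows the trade-off described above: every add and remove costs a little extra, and the index keeps strong references to its elements, in exchange for lookups that do not walk the tree.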
About the code/HTML, it is dynamic and real-time. It is loaded over WebSockets, and the elements are talking to the backend in real-time over the sockets. I'm using an original (Trygve Reenskaug) MVC approach. Essentially, each Web Component is an MVC component, with the HTML/elements and code accessed only through the controller. I am looking at the incoming code for cases where several searches are being performed on the same attribute (or element). I give these a generated id, create indexes on them, and expose them as properties on the controller. The underlying framework uses a set of common attributes that are searched on a lot, but only for a small set of elements. These are also indexed. So at the cost of slower startup (offset to some degree by doing some of this in a Web Worker and/or server-side), I can read and write "Form Fields" quickly.
Many language features are implemented to wrap or optimize common or repetitive use cases, or to move code to a more efficient part of the architecture. Indexing can do both. Without doing things server-side or in Workers, the indexing consumes UI cycles. Adding an indexing "hint" could allow all or part of this code to be moved back into the "system" or C++ layer (e.g., into querySelector internals supported by low-level map stores), or into parsing (like I'm doing), taking some of the repetitive work off the UI and developers' hands.
If the HTML elements have a unique id set there is no search to perform (document.getElementById("id")), correct?
Form fields can be created, set and changed using FormData objects without using HTML elements at all.
Still not gathering what is meant by unique indexes.
it's meant there couldn't be two indexes with the same value, but even IDs can be duplicated on a page (it's not suggested or anything, but nothing prevents you from doing that)
to be honest, since IDs already cover the whole story (IDs are accessible even via globalThis/window, no need to query the document) I guess this whole story is about having el.uid, as opposed to el.id, so that a uid cannot be duplicated (it throws if it is), and document.uid[unique-uid-value] would return, without querying, the live node (if any)
however, I think this whole discussion in here makes no sense, as JS itself has nothing to do with HTML 🤷♂️
@Andrea Giammarchi, while the connection is secondary, HTML often serves as the specification for the creation of JS objects. And while it could be considered a sub-set, JS is full of HTML related features - HTMLElement for one. Thing is, if you are programming in JS for browser applications, you’re dealing with HTML-adjacent JS at some point. What I’m trying to do, though, somewhat supports your point. I see a lot of higher-level code manipulating HTML tags, which feels really wrong. Even dealing with HTMLElement in higher-level code doesn’t seem to make a lot of sense. I’m trying to encapsulate the elements and tags, and move that point as far into the background as I can.
@guest271314
If we think of Indexes as a type of key-value pairs, a “regular” Index allows duplicate keys, and a Unique Index requires unique keys. Indexes are always sorted on their keys. So in this case, when the index is built, it creates k-v pairs of attributeName-elementId, ordered by attributeName. To get all elements with a specific attribute, we just find the first one with that key, and keep reading - getElementById(elementId) - until the key changes.
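The lookup described above (find the first entry for a key, then keep reading until the key changes) can be sketched over a sorted array of attributeName-elementId pairs. The function names below are assumptions for illustration.

```javascript
// Build the index: [attributeName, elementId] pairs ordered by attributeName.
function buildIndex(pairs) {
  return [...pairs].sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// Binary search for the first entry whose key is >= the wanted key.
function lowerBound(index, key) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid][0] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Read forward from the first match until the key changes.
function idsFor(index, key) {
  const ids = [];
  for (let i = lowerBound(index, key); i < index.length && index[i][0] === key; i++) {
    ids.push(index[i][1]);
  }
  return ids;
}

// Browser usage (assumed):
// const elements = idsFor(index, 'prop').map((id) => document.getElementById(id));
```

A Map from attribute name to a list of ids would give the same result with less code; the sorted form is shown only because it mirrors the database-style index being described.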
You’re right about id. I’m converting generic, multi-instance template “tags” into elements with ids, so I can access them directly without searching, just using getElementById. The template has been “localized” per instance, and encapsulated behind a controller. I want to avoid dealing with HTML, and even more HTTP verb related things like Form and FormData and just deal with general JS objects, so I use Models instead of things like FormData.
So for example, the business perspective of a “Person” has “Age” data. A page may display multiple people at once.
<custom-person id="person1"></custom-person>
<custom-person id="person2"></custom-person>
The goal is to get from the source tag to non-html/element related JS as soon as possible.
The template behind this might look something like
<framework-container>
  <input prop="name" />
  <input prop="age" />
</framework-container>
When connectedCallback runs, it creates a View using the template
<sometag>
  <input id="person1_name" />
  <input id="person1_age" />
</sometag>
<sometag>
  <input id="person2_name" />
  <input id="person2_age" />
</sometag>
A Model
class person {
  name;
  age;
}
And a dynamically configured Controller and instance. A base Person class contains common functionality.
class Person1 extends Person {
  // Reads the element the View generated for this instance.
  get name() { return document.getElementById('person1_name'); }
  // …
  get model() { return this.populateModel(); }
}
self.controllers.add(new Person1());
Now I don’t need to deal with any HTML/element or “tag hierarchy” related JS. I pretty much abstract out the HTML and HTMLElement pieces when the Custom Element is initially loaded.
const personAge = self.controllers.person1.age;
At a lower level, I can create attribute related properties using the dynamically assigned element id.
<input prop="name" attrib="style.color" />
This would end up creating a property or methods on the controller that allow me to not have to deal with styles and CSS classes directly, and even constrain values.
self.controllers.person1.name.color = "red";
So the whole index thing started when I was loading/parsing dynamic HTML/JS code and searching for prop and attrib repeatedly. If I know I’m going to be searching on an attribute a lot, maybe I could give the parser/engine a hint it could use to optimize that search.
while it could be considered a sub-set, JS is full of HTML related features - HTMLElement for one.
HTMLElement is defined by the WHATWG HTML Living Standard (html.spec.whatwg.org/multipage/dom.html#htmlelement); it has nothing to do with JS.
JS is a general purpose programming language that implements the ECMAScript standard, which on the Web gets enriched with some functionality, while on Node.js it gets enriched with some other (and indeed HTMLElement doesn't exist there).
In GJS (Desktop UI) it has other features too, so asking in a JS related mailing list to bring in something strictly DOM related (WHATWG) is not appropriate.
Historically speaking, the only strictly DOM related things that went in were the likes of String.prototype.blink and similar methods, but today JS is really not Web based anymore, even if the Web is one of its primary targets (but then again, with WASM around, any programming language can target the Web, so you want this proposal to land in WHATWG, not here).
Sorry. My confusion.
I want to avoid dealing with HTML
Using HTML is part of the premise of the proposal, correct?
Am still not sure what the actual requirement is.
If the requirement is to prevent duplicate values being input by the user, you can utilize the pattern attribute of <input> with a RegExp built from the current values of the <input> elements, together with oninvalid and checkValidity(), which will provide the functionality of the value attribute of <input> and <select> elements being unique as to a <form> element.
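One way to read the suggestion above is to build a pattern that rejects any value already in use. The sketch below is an assumption about how that could be done, not the poster's own code; the helper names are hypothetical. It relies on the fact that the pattern attribute is matched against the entire value.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Build a pattern that accepts anything except the values already taken.
function uniquePattern(takenValues) {
  if (takenValues.length === 0) return '.*';
  const alternatives = takenValues.map(escapeRegExp).join('|');
  // Negative lookahead: fail if the whole value equals a taken one.
  return `(?!(?:${alternatives})$).*`;
}

// Browser usage (assumed form and field names):
// const taken = [...form.querySelectorAll('input[name="nick"]')]
//   .filter((el) => el !== input)
//   .map((el) => el.value);
// input.pattern = uniquePattern(taken);
// input.oninvalid = () => input.setCustomValidity('Value must be unique');
// const ok = input.checkValidity();
```

The pattern would need to be rebuilt whenever one of the other fields changes, since it is a snapshot of the values at the time it was generated.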
If there is no user input there should not be any issue creating unique key-value pairs using Map, WeakMap, Set, or other means.
and even more HTTP verb related things like Form and FormData and just deal with general JS objects, so I use Models instead of things like FormData.
Am not certain what a "Model" is.
A FormData object can be serialized and represented in various manners, including as an array of JavaScript arrays of key-value pairs that can be adjusted to the exact keys and values required ([...formData]), as multipart/form-data, etc. An earlier post mentioned forms at
I can read and write "Form Fields" quickly
Is user input involved in the procedure relevant to "Indexing HTML Attributes and Unique Indexes"?
What are you trying to achieve that you are not able to with the current code?
What do you consider to be "general JS objects"?
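For reference, the [...formData] representation mentioned above looks like this in practice. The field names and values are made up for illustration; FormData is assumed to be available as a global, as it is in browsers and in Node.js 18 or later.

```javascript
// Create and fill a FormData object without touching any HTML element.
const formData = new FormData();
formData.append('name', 'Ada');
formData.append('age', '36');

// Spread into an array of [key, value] pairs.
const entries = [...formData];

// Collapse into a plain object (for repeated keys only the last value survives).
const model = Object.fromEntries(formData);
```

The resulting plain object is probably close to what the thread calls a Model, although that reading is an assumption.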
You'd have better luck asking for this feature in discourse.wicg.io. ES Discuss is about the JS language itself and the related ECMAScript spec, not the Web APIs that are implemented in most browsers, usually separately to the JS implementations themselves.
Isiah Meadows contact at isiahmeadows.com, www.isiahmeadows.com
I've been working with Custom Elements and I'm writing a lot of code against tag attributes. In some cases, I want the attribute values to be unique on a page (like id). It got me wondering about how the engines handle attribute based searches, and if indexing (with unique/distinct options) would provide value. I also find myself writing a lot of boilerplate getters/setters for attributes in the elements. Attribute handling could be improved by adding some additional support with something like an attrib feature. This would be similar to get or set in use. This would create the attribute my-attribute on the tag and element, and also generate a getter and setter. The index flag would tell the engine it should create a map/hash to improve search optimization for heavily searched attributes. The distinct flag would indicate that all values for that attribute within context (e.g., document) should be unique. This might be used primarily by IDEs to generate warnings.