#regex

Thanks to my recent obsession with the #OscarMamen travel logs of his journey(s) to and time in #Mongolia #BogdKhanate, I have been wrangling data from a photo archive database.
For unknown reasons, the database doesn't have a field for "date". All date info is stored alongside content descriptions of the photos and their location in the physical archive in the free-text field "motive description".
To work through 7,500 photos and match them to log entries, I re-taught myself #regex & @OpenRefine
Can recommend!
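
A minimal sketch of pulling a date out of a free-text field like "motive description". The post does not say how dates are written in the archive, so the two shapes handled here (`DD.MM.YYYY` and a bare four-digit year) and the sample descriptions are assumptions for illustration only.

```python
import re

# Hypothetical date shapes; the real archive's conventions are not given in the post.
DATE_PATTERN = re.compile(
    r"""
    \b(?P<day>\d{1,2})\.(?P<month>\d{1,2})\.(?P<year>\d{4})\b   # full date, e.g. 14.7.1927
    |
    \b(?P<year_only>1[89]\d{2})\b                                # bare year, e.g. 1925
    """,
    re.VERBOSE,
)


def extract_date(description):
    """Return (year, month, day) from the first date found, or None.

    month and day are None when only a year is present.
    """
    match = DATE_PATTERN.search(description)
    if match is None:
        return None
    if match.group("year_only"):
        return (int(match.group("year_only")), None, None)
    return (int(match.group("year")), int(match.group("month")), int(match.group("day")))


print(extract_date("Camp near Urga, 14.7.1927, box 12"))
print(extract_date("Caravan on the steppe, 1925"))
```

The same pattern could probably be adapted to an OpenRefine transform, but the exact expression would depend on how the archive actually writes its dates.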

behold: your new favorite cli tool for batch renaming files and directories via #regex, fast *and* safe. i'm very confident.

github.com/shenwei356/brename

also available on #macOS and #linux via #homebrew as `brename`

(macOS users that like a GUI: i also use Name Mangler and it lets you 'playground' various methods easily too.)
GitHub - shenwei356/brename: A practical cross-platform command-line tool for safely batch renaming files/directories via regular expression

I still can't believe that most programming systems we use today are preoccupied with numbers. AFAIK, half of the (R5RS?) #Scheme standard is numbers and operations on them. Same for #C, #CommonLisp, #Java: ten different types of numbers and huge libraries for them.

Humans think in images and words. Structured text-oriented languages feel like a much better fit for everyone not corrupted by C. Yet we have few to no popular attempts in that space. Structured Regular Expressions didn't catch on; #ed1 and #awk are considered mere #regex automation tools. Modal and the term rewriting systems have their Merveilles Town, but not much beyond. sh/#bash and the like are quite successful, but aren't considered real programming languages either.

Why.

Ah yes, the age-old tale of Regex: the #mystical, misunderstood #ogre of the #coding world 🤖. The article bravely ignores 90% of its complexity, claiming if you squint just right, it's actually a friendly #garden #gnome 🧙‍♂️. Just remember, if your regular code takes days and your #regex takes minutes, maybe it's time to 🛠️ refine those coding #skills.
timkellogg.me/blog/2023/07/11/ #Developer #Humor #HackerNews #ngated

timkellogg.me: Regex Isn't Hard

Whoever uses #regex should know about this invaluable tool:

regex101.com/

I consider myself a regex expert, and still every now and then I have cases which I can't figure out myself. This tool has never let me down so far... You can of course configure it to operate according to most of the important #regexp "flavors"...

regex101: build, test, and debug regex. Regular expression tester with syntax highlighting, explanation, cheat sheet for PHP/PCRE, Python, GO, JavaScript, Java, C#/.NET, Rust.

Do I have #regex experts among my followers? echo "123.4506000" | sed -E 's/(\.[0-9]*[1-9])?0+$//; s/\.$//' is intended to remove trailing 0s when it's a number with a decimal point. But when there are no digits behind the decimal point other than 0s, the whole number shall be stripped of the point and the 0s. What am I doing wrong? Sharing appreciated.
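
One possible answer, sketched in Python: the first substitution appears to throw away the captured group because the replacement is empty, so `.4506000` disappears as a whole, and because the group is optional an integer such as `100` would lose its zeros too. The function name below is hypothetical; the idea is to make the group mandatory and put it back with a backreference.

```python
import re


def strip_trailing_zeros(number):
    """Remove trailing zeros after a decimal point; drop the point if nothing is left."""
    # Keep everything up to the last non-zero fractional digit (\1), drop the zeros after it.
    number = re.sub(r"(\.\d*[1-9])0+$", r"\1", number)
    # If the fraction is empty or consists only of zeros, remove it together with the point.
    return re.sub(r"\.0*$", "", number)


for value in ["123.4506000", "123.000", "100", "0.50"]:
    print(value, "->", strip_trailing_zeros(value))
```

The same two substitutions should translate back to sed as `sed -E 's/(\.[0-9]*[1-9])0+$/\1/; s/\.0*$//'`.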

🚀 Behold the epic tale of Janet's #PEG #module, where the author heroically excludes regular expressions like they're yesterday's news. 💥 Marvel at the labyrinth of #parsing magic that claims to be more readable, but only if you have a PhD in arcane text processing. 📜✨
bakpakin.com/writing/how-janet #Janet #readability #textprocessing #regex #HackerNews #ngated

bakpakin.com: How Janet's PEG module works. An in depth explanation of pegs and how they work.

You know how you can understand something like Regular Expressions (Grep) well enough to write really complex statements that precisely match the nuanced text patterns you’re hoping to revise, but then 6 months later you look at this RegEx you wrote and can’t parse it at all? Like, literally no idea how it works? As in it-would-be-faster-to-recreate-it-from-scratch-than-to-understand-it-enough-to-change-it-even-slightly?

I feel like that all the time.

Arrived here in early 2025, big up to the @admin of Piaille.fr! #introduction:
Having fallen into the #OpenSource cauldron in 2000, I feed on #bash commands. Text files, #grep and its pretty #regex, #ansible, #git, #greasemonkey, automated tests and monitoring are your friends.
Bowed, plucked and struck strings, sounds that are blown, sung or beatboxed, electronic or scratched sounds all move me. Nothing beats a good evening spent jamming / recording for a beatmaker / jumping on stage to accompany when an instrumentalist is missing / transcribing entire pieces onto paper the old-fashioned way / improvising with the kids
Committed to #AMAP and pro #CNV

@stilgherrian it’s been a long time since I’ve admin’d a mail server in anger (always in anger… 😜) but I have much recent experience when trying to interact with MTAs & other systems when using plain old plus-addressing that worked perfectly well for decades back when crusty greybeards (👋 honorary greybeard) ran the global infra

the most common / painful variants:

  • sign-up form accepts plus-address email & “completes” the process, but confirmation email never arrives (silently dropped somewhere in the pipeline, I have no way of knowing exactly where) 😩

  • sign-up form has a (crappy) client-side “email address validation” with varying forms of incorrect / incomplete understanding of relevant RFCs, which can almost always be traced back in their web page code to crappy #regex that doesn’t match the full set of RFC-permitted chars 🤬

it’s been so painful that I’ve stopped using plus addressing altogether & just reverted to a catch-all email config on my domains (which fortunately my mail provider supports well, I can reply from whatever email something comes in to), as well as masked emails using the provider’s domain depending on use case 💁‍♀️
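
A small sketch of the second failure mode described above. `NAIVE` is a hypothetical pattern of the over-strict kind found in sign-up forms, not one taken from any real site; `PERMISSIVE` allows the unquoted local-part characters that RFC 5322 lists as `atext`, which include `+`. It deliberately ignores quoted local parts, domain literals and internationalized addresses, so it is an illustration rather than a complete validator.

```python
import re

# Hypothetical over-strict form validation: "+" is missing from the local part.
NAIVE = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Unquoted local-part characters ("atext") from RFC 5322; the hyphen is last so it stays literal.
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"
PERMISSIVE = re.compile(
    rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
)


def accepts(pattern, address):
    """True if the whole address matches the given pattern."""
    return pattern.fullmatch(address) is not None


address = "user+shop@example.com"
print("naive:", accepts(NAIVE, address))
print("permissive:", accepts(PERMISSIVE, address))
```

Even the permissive pattern only checks the shape of an address; whether mail actually arrives can only be confirmed by sending it.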

@dansup @Gargron Also, the Mastodon filter implementation is anything but powerful since the capability for #RegEx has been removed.
Many things cannot be filtered at all anymore; others need tons of primitive filters for what could be accomplished with a single RegEx before.
I still don’t get the reason behind the removal. It just does not make sense.
#LeSigh

Rescuing data carried on a bustard's back
linuxfr.org/users/siltaar/jour

Or how the association @GEBULL, a small provincial Linux user group (GUL), contributed to a scientific study.

At the end of November 2021, the Groupe Ornithologique des Deux-Sèvres entrusted us with a bird-tracking tag. It had been silent for 4 years on the back of a little bustard, lost to science but recently recaptured, and our instructions were to extract the last recorded data from it at all costs.

#LinuxFr #GUL #GODS

Anybody else download like 1000 TikToks and max out their phone's storage and their backup solution? No...yeah...Me Neither.

But if someone did, here's a way to solve it quickly on Android devices.

Since #TikTok names all its video files 32bithexadecimalvalue.mp4, we can use a little #grep and #regex along with #Termux to sort through the #Android #camera roll and delete all the corresponding files.

justinmcafee.com/posts/2025/so
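
A rough sketch of the matching step in Python rather than grep. It assumes "32bithexadecimalvalue" means a name of 32 hexadecimal characters followed by `.mp4`; the function names are hypothetical, and the code only lists candidates so the result can be reviewed before anything is deleted.

```python
import re
from pathlib import Path

# Assumption: TikTok downloads are named with 32 hex characters plus ".mp4".
TIKTOK_NAME = re.compile(r"[0-9a-f]{32}\.mp4", re.IGNORECASE)


def looks_like_tiktok(filename):
    """True if the bare file name has the assumed TikTok shape."""
    return TIKTOK_NAME.fullmatch(filename) is not None


def find_tiktok_videos(directory):
    """List matching files in a directory without deleting anything."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and looks_like_tiktok(path.name)
    )


print(looks_like_tiktok("0123456789abcdef0123456789abcdef.mp4"))
print(looks_like_tiktok("IMG_20250101_120000.mp4"))
```

Using `fullmatch` instead of `search` keeps longer names that merely contain 32 hex characters from being picked up by accident.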

justinmcafee.com: Justin McAfee - So You Downloaded a Thousand TikToks

I'm trying to provide aliases for #regex definitions so that humans can better understand my code when scanning it.

However, a pattern as terse as `/[^\s\t_-\/\.=<>:]+/` becomes hideously long when described with English descriptors.

Is there a midpoint reference or shorthand that could serve as a compromise?

I may just provide a vague numbered reference as a hack - but this is obviously inelegant and a recipe for bugs (should I label something else with the same nomenclature)
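
One possible midpoint, sketched in Python: name the groups of characters rather than the whole pattern, then assemble the character class from those names. All names below are hypothetical. Note that `_-\/` in the pattern above appears to read as a range from `_` to `/`, which is out of order and rejected by many engines, so the sketch escapes the hyphen; `\t` is dropped because `\s` already covers it.

```python
import re

# Hypothetical names for the characters a token must NOT contain.
WHITESPACE = r"\s"       # spaces, tabs, newlines
WORD_JOINERS = r"_\-"    # underscore and a literal hyphen (escaped so it is not a range)
PATH_CHARS = r"/."       # slash and dot
OPERATORS = r"=<>:"      # assignment, comparison and colon

SEPARATORS = WHITESPACE + WORD_JOINERS + PATH_CHARS + OPERATORS

# A token is a run of anything that is not a separator.
TOKEN = re.compile(rf"[^{SEPARATORS}]+")

print(TOKEN.findall("key=value a_b-c/d.e"))
```

Each constant stays short enough to read at a glance, and a name like `SEPARATORS` can be reused elsewhere without resorting to numbered references.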