[Question] Sed and regex string manipulation
-
[email protected]replied to [email protected] last edited by
skip the following substitute command if the line contains an http link in markdown format
Why do you assume there's only one link in the line?
-
[email protected]replied to [email protected] last edited by
I didn't test this, but it will change the whole URL while changes are only needed in its fragment component (after the first #).
-
[email protected]replied to [email protected] last edited by
Obligatory regex was a mistake post
-
[email protected]replied to [email protected] last edited by
Hmm, OP mentioned "Only edit what’s between parentheses" - don't see anywhere that whole URL shouldn't be changed...
-
[email protected]replied to [email protected] last edited by
Paths are constant, only anchors are generated by forgejo.
-
[email protected]replied to [email protected] last edited by
Why do you assume there's only one link in the line?
They did not want external (http) links to be modified, as that would break them:
- [Example](https://example.com/#Some%20Link)
- [Example](https://example.com/#some-link)
I compromised by assuming it would be unlikely enough to have an external http link AND an internal link within the same line. You could probably still do it; my first thought was [^h][^t][^t][^p], but that would cause issues for #ttp and #A, so I just gave up. Instead I think you'd want a different approach, like breaking each link onto its own line, doing the same external/internal check before the substitution, and joining the lines afterward.
Also, you perform substitutions in the whole URL instead of the fragment component
That requirement I missed. I just assumed the filename would be replaced the same way too, lol. Not too hard to fix though.
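The one-link-per-line idea could be sketched roughly like this. It is only a sketch under a few assumptions: GNU sed (for `\n` in the replacement and for `\L`), link targets that contain no `)`, and a made-up function name `fix_fragments`. Each `](` is pushed onto its own line, targets starting with http(s) are left alone, only the part after the first `#` is lowercased and has its `%20` turned into `-`, and the pieces are joined again.

```shell
# Hypothetical helper: reads markdown on stdin, writes the result to stdout.
# Assumes GNU sed; not tested against the original files from the thread.
fix_fragments() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Pipeline steps:
        #   1. start a new line before every link target "](...)"
        #   2. on pieces that are not http(s) links, lowercase the fragment
        #      and loop until no %20 is left inside it
        #   3. join the pieces back into a single line
        printf '%s\n' "$line" |
            sed 's/](/\n&/g' |
            sed -E '/^\]\(https?:/!{
                s/^(\]\([^)#]*)(#[^)]*)/\1\L\2/
                :a
                s/^(\]\([^)#]*#[^)]*)%20/\1-/
                ta
            }' |
            tr -d '\n'
        printf '\n'
    done
}
```

It could then be used as a filter, for example `fix_fragments < input.md > output.md` (both file names are placeholders). The part of the target before the `#` is never touched, so a path like `Another%20markdown%20file.md` keeps its `%20`.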
-
[email protected]replied to [email protected] last edited by
Don't reinvent the wheel! https://github.com/jgm/pandoc
-
[email protected]replied to [email protected] last edited by
Not home so I can't try it but do you need to be so specific to match the whole markdown syntax?
You might be able to get away with
s/#(\w+%20)*\w+\.\w{2,3}/\L&/g; /#(\w+%20)*\w+\.\w{2,3}/ s/%20/-/g
Basically, matching #this%20is%20LIKELY%20a%20link.md as opposed to matching the whole markdown link, lowercasing that entire match, then, on lines that match the same pattern, replacing the %20 with a hyphen (combined into a single sed command). This only fails when an http link falls within the same line as a markdown hyperlink.
-
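To see what that command does, it can be tried on a couple of sample lines. This is a small sketch, assuming GNU sed (`\L` and `\w` are GNU extensions) and the `-E` flag for the unescaped `( )` and `{2,3}`; the function name `lower_and_dash` and the sample lines are made up for illustration.

```shell
# Hypothetical wrapper around the one-liner from the post above (GNU sed assumed).
lower_and_dash() {
    sed -E 's/#(\w+%20)*\w+\.\w{2,3}/\L&/g; /#(\w+%20)*\w+\.\w{2,3}/ s/%20/-/g'
}

# Internal link only: the fragment is lowercased and its %20 become hyphens.
printf '%s\n' '[Just a test](#Just%20a%20test.md)' | lower_and_dash

# Internal link and http link on the same line: the second substitution
# runs on the whole line, so the http link loses its %20 as well.
printf '%s\n' '[A](#Some%20File.md) [B](https://example.com/a%20b)' | lower_and_dash
```

The second sample is the failure case mentioned in the post: the address only selects lines, so every `%20` on a selected line is replaced, not just the ones inside the fragment.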
[email protected]replied to [email protected] last edited by
Hello !!!
Sorry for the very late response, I had something else to do. I will read everything carefully and respond to every post. I also thought about it overnight, and I think that sed and regex weren't the best option here (as others have mentioned).
I think a Python or bash script (as you mentioned a bit later) would be a better way. I'm sorry that I put you through all of this... wrong tool for the job :s.
-
[email protected]replied to [email protected] last edited by
First, thanks again for sharing your knowledge with me. I really appreciate the time and effort you took to write all of this. I know that's a lot of thank-yous, but I'm really grateful; this is very valuable information that I will keep in my knowledge base. It's really time I learn proper bash/Python/Perl(?) scripting with all those tools (grep/sed/regex).
Second, YOU MISSED A DAMNED parenthesis you fool xD !
mdlinks="$(grep -Po ']\((?!https).*\)' ~/mkdn)"
Took me some time to figure it out, with a very uninformative error:
bashscript.sh: line 8: unexpected EOF while looking for matching "'
but as expected it works !
From
-------
[Just a test](#Just%20a%20test.md)
[Just a link](https://mylink/%20with%20space.com)
%20
To
-------
[Just a test](#Just-a-test.md)
[Just a link](https://mylink/%20with%20space.com)
%20
Next, to show my appreciation and not take everything for granted or be spoon-fed everything, I tried to find a solution myself for something else. I will try to explain as best I can how I solved it.
From
-------
[Just a test](Another%20markdown%20file.md#Hello%20World)
To
-------
[Just a test](Another%20markdown%20file.md#hello-world)
The part before the hashtag needs to keep its initial form (it links to the original markdown file). So, rather than just playing around with Perl and regex (which doesn't end well when done blindly without the proper knowledge), I did some simple string manipulation. It's not very elegant but does the trick, thanks to your well-written breakdown.
- I printed out the $mdlinks variable just to see what it contains
- Copied and changed your Perl/regex to find the first hashtag (#) and save it into a new variable ($mdlinks2)
- Fed your $mdlinks variable into my new Perl/regex
- Fed my new variable into done? (I'm a bit confused here but okay xD)
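#!/bin/bash
# Collect the targets of all markdown links that do not start with https
mdlinks="$(grep -Po ']\((?!https).*\)' "/home/dany/newtest.md")"
echo "$mdlinks"
# Keep only the part of each target from the first # onward
mdlinks2="$(grep -Po '#.*' <<<"$mdlinks")"
echo "$mdlinks2"
# For each fragment, replace %20 with - and patch the file in place
while IFS= read -r line; do
    dashlink="$(echo "$line" | sed 's|%20|-|g')"
    sed -i "s/$line/${dashlink}/" "/home/dany/newtest.md"
done <<<"$mdlinks2"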
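The target shown earlier in this post (`#Hello%20World` becoming `#hello-world`) also lowercases the fragment, which the loop does not do by itself. One possible sketch of that extra step, using only sed and tr: `slugify_fragment` is a made-up name, and it only handles `%20`, not other percent-escapes or whatever else forgejo may do when it generates anchors.

```shell
# Hypothetical helper: turn "#Hello%20World" into "#hello-world".
slugify_fragment() {
    printf '%s\n' "$1" | sed 's/%20/-/g' | tr '[:upper:]' '[:lower:]'
}

slugify_fragment '#Hello%20World'
```

Inside the loop, `dashlink="$(slugify_fragment "$line")"` could stand in for the existing `dashlink=` line. Since `$line` only ever holds the part from the `#` onward, the file name before the `#` would still be left as it is.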
Yes, not very elegant, but it's the best I could do for now. However, I still got a YES effect.
To answer your question:
Quick question as I’m working on this, in the new link example, is the BDMV and other capitalized text in this link supposed to be converted to lowercase, or to remain uppercase?
As you can see in my string manipulation above, the part before the # needs to keep its original form (sorry, I wasn't aware of this before working with the original files). I solved it with the string manipulation shown above.
I'm a bit tired from all this searching and trial & error; tomorrow I will try to wrap everything up and answer your post below ! Also, I need to clean up the mess I made in my home directory xD.
Thanks again for your help ! Have a good night/day !
-
[email protected]replied to [email protected] last edited by
Oh god! I'm sorry about the missing )! I must have dropped it when copying things from my notes over to post the comment! (≧▽≦)
Despite my error, I'm glad it worked, and even happier that you were able to take what we had worked out and modify it further to fit your other requirements. It's fun helping each other out, and it's also great learning.
I learn by problem solving, so I've got all my notes from working on this in my knowledge base as well!
In the future, feel free to ping me if you need help with other linux/cli/bash things. As I've mentioned before I'm no expert, but happy to help where I can.
-
[email protected]replied to [email protected] last edited by
No apologies necessary!