I am trying to identify which files in my project have incorrect headers. The files all start like this:
---
header:
.
.
.
title:
some header:
.
.
.
more headers:
level:
.
.
.
---
Here . . . just stands for more headers. The headers contain no indentation. Using the following expression, I have been able to extract the YAML header from every file:
grep -Przo --include=\*.md "^---(.|\n)*?---" .
Now I want to list the incorrect YAML headers.
- Every YAML header must have a title: some text
- Every YAML header must have a language: [a-z]{2}
- It must contain either an external: .* or an author: .*
- The placement of title:, level:, external: and language: varies.
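To make the rules above concrete, here is a minimal sketch of how they could be checked in one pass with awk instead of grep. The function name check_header is hypothetical, and the sketch assumes the header is the block between the first two lines that consist of exactly `---` (no trailing whitespace) and that the keys are not indented.

```shell
# check_header FILE: print FILE if its YAML header breaks any of the rules.
# Hypothetical helper; assumes the header sits between the first two lines
# that are exactly '---' and that header keys start in column one.
check_header() {
  awk '
    /^---$/ { if (++fence == 2) exit; next }          # stop at the closing ---
    fence == 1 && /^title: ./                { title = 1 }  # title with some text
    fence == 1 && /^language: [a-z][a-z]$/   { lang = 1 }   # two-letter code
    fence == 1 && /^(external|author): /     { who = 1 }    # either one is enough
    END { if (!(title && lang && who)) print FILENAME }     # report offenders
  ' "$1"
}
```

It could then be run over the project with something like `find . -name '*.md' | while IFS= read -r f; do check_header "$f"; done`. Because awk stops reading at the closing `---`, a line such as `author:` in the body of the file does not count.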
I tried to do something like
grep -rL --include=\*.md -e "external: .*" -e "author: .*" .
The problem is that this searches the entire file, not just the YAML header. So I guess solving the issues above boils down to feeding the YAML header result from my previous search into grep again. I tried:
grep -Przo --include=\*.md "^---(.|\n)*?---" . | xargs -0 grep "title:";
However, this gave me the error "No such file or directory", so I am unsure how to proceed.
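The error most likely comes from xargs handing the extracted header text to grep as arguments, which grep then tries to open as file names. One possible way around this, sketched below, is to extract the header one file at a time and pipe it into grep, so the file name stays known. The function names yaml_header and list_missing are hypothetical, and the sketch assumes the header is delimited by lines that are exactly `---`.

```shell
# yaml_header FILE: print only the lines between the first two '---' lines.
# Hypothetical helper; assumes the fence lines are exactly '---'.
yaml_header() {
  awk '/^---$/ { if (++fence == 2) exit; next } fence == 1' "$1"
}

# list_missing PATTERN FILE...: print each FILE whose header has no line
# matching the extended regex PATTERN (like grep -L, but header-only).
list_missing() {
  pattern=$1; shift
  for f in "$@"; do
    # grep -q only sets the exit status; the file name is printed on failure
    yaml_header "$f" | grep -Eq -- "$pattern" || printf '%s\n' "$f"
  done
}
```

For example, `list_missing '^(external|author): ' ./*.md` would list the files in the current directory whose header has neither key, and `list_missing '^language: [a-z]{2}$' ./*.md` those without a valid language.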
Examples:
---
title: Rull-en-ball
level: 1
author: Transkribert og oversatt fra [Unity3D](http://unity3d.com)
translator: Bjørn Fjukstad
license: Oversatt fra [unity3d.com](https://unity3d.com/learn/tutorials/projects/roll-ball-tutorial)
language: nb
---
Correct YAML, has an author, language and title.
---
title: Mini Golf
level: 2
language: en
external: http://appinventor.mit.edu/explore/ai2/minigolf.html
---
Correct YAML, has a title, language, and external instead of author.
---
title: 'Stjerner og galakser'
level: 2
logo: ../../assets/img/ccuk_logo.png
license: '[Code Club World Limited Terms of Service](https://github.com/CodeClub/scratch-curriculum/blob/master/LICENSE.md)'
translator: 'Ole Andreas Ramsdal'
language: nb
---
Incorrect YAML header: it has neither an author nor an external.